WITH SPECIAL APPLICATION TO TELEPHONE CHANNELS*


ABSTRACT 

The objective of this study is to determine the performance, or a bound on the performance,
of the "best possible" method for digital communication over fixed time-continuous channels
with memory, i.e., channels with intersymbol interference and/or colored noise. The channel
model assumed is a linear, time-invariant filter followed by additive, colored Gaussian noise.
A general problem formulation is introduced which involves use of this channel once for T
seconds to communicate one of M signals. Two questions are considered: (1) given a set of
signals, what is the probability of error? and (2) how should these signals be selected to minimize
the probability of error? It is shown that answers to these questions are possible when a suitable
vector space representation is used, and the basis functions required for this representation are
presented. Using this representation and the random coding technique, a bound on the
probability of error for a random ensemble of signals is determined and the structure of the
ensemble of signals yielding a minimum error bound is derived. The inter-relation of coding and
modulation in this analysis is discussed and it is concluded that: (1) the optimum ensemble of
signals involves an impractical modulation technique, and (2) the error bound for the optimum
ensemble of signals provides a "best possible" result against which more practical modulation
techniques may be compared. Subsequently, several suboptimum modulation techniques are
considered, and one is selected as practical for telephone channels. A theoretical analysis
indicates that this modulation system should achieve a data rate of about 13,000 bits/second on a
data grade telephone line with an error probability of approximately 10 [exponent illegible in
source]. An experimental program substantiates that this potential improvement could be realized
in practice.


Accepted  for  the  Air  Force 
Stanley  J.  Wisniewski 
Lt Colonel, USAF
Chief,  Lincoln  Laboratory  Office 


*  This  report  is  based  on  a  thesis  of  the  same  title  submitted  to  the  Department  of  Electrical 
Engineering  at  the  Massachusetts  Institute  of  Technology  in  October  1964,  in  partial  fulfillment 
of  the  requirements  for  the  Degree  of  Doctor  of  Philosophy. 




TABLE  OF  CONTENTS 


Abstract

CHAPTER I - DIGITAL COMMUNICATION OVER TELEPHONE LINES

A. History
B. Current Interest
C. Review of Current Technology
D. Characteristics of Telephone Line as Channel for Digital Communication
E. Mathematical Model for Digital Communication over Fixed Time-Continuous
   Channels with Memory

CHAPTER II - SIGNAL REPRESENTATION PROBLEM

A. Introduction
B. Signal Representation
C. Dimensionality of Finite Set of Signals
D. Signal Representation for Fixed Time-Continuous Channels with Memory

CHAPTER III - ERROR BOUNDS AND SIGNAL DESIGN FOR DIGITAL COMMUNICATION OVER
              FIXED TIME-CONTINUOUS CHANNELS WITH MEMORY

A. Vector Dimensionality Problem
B. Random Coding Technique
C. Random Coding Bound
D. Bound for "Very Noisy" Channels
E. Improved Low-Rate Random Coding Bound
F. Optimum Signal Design Implications of Coding Bounds
G. Dimensionality of Communication Channel

CHAPTER IV - STUDY OF SUBOPTIMUM MODULATION TECHNIQUES

A. Signal Design to Eliminate Intersymbol Interference
B. Receiver Filter Design to Eliminate Intersymbol Interference
C. Substitution of Sinusoids for Eigenfunctions

CHAPTER V - EXPERIMENTAL PROGRAM

A. Simulated Channel Tests
B. Dial-Up Circuit Tests
C. Schedule 4 Data Circuit Tests

APPENDIX A - Proof that the Kernels of Theorems 1 to 3 Are
APPENDIX B - Derivation of Asymptotic Form of Error Exponents
APPENDIX C - Derivation of Equation (63)
APPENDIX D - Derivation of Equation (70)
APPENDIX E - Proof of Even-Odd Property of Nondegenerate Eigenfunctions
APPENDIX F - Optimum Time-Limited Signals for Colored Noise


DIGITAL  COMMUNICATION 

OVER FIXED TIME-CONTINUOUS CHANNELS WITH MEMORY
WITH  SPECIAL  APPLICATION  TO  TELEPHONE  CHANNELS 


CHAPTER  I 

DIGITAL  COMMUNICATION  OVER  TELEPHONE  LINES 


A.  HISTORY 

An interest in low-speed digital communication over telephone circuits has existed for many
years. As early as 1919, the transmission of teletype and telegraph data had been attempted
over both long-distance land lines (Ref. 1) and transoceanic cables (Ref. 2). During these
experiments it was recognized that data rates would be severely limited by signal distortion
arising from nonlinear phase characteristics of the telephone line. This effect, although present
in voice communication, had not been previously noticed due to the insensitivity of the human ear
to phase distortion. Recognition of this problem led to fundamental studies by Carson
(Refs. 3, 4), Nyquist (Refs. 5, 6), and others (Ref. 7). From these studies came techniques,
presented around 1930, for quantitatively measuring phase distortion (Ref. 8) and for equalizing
lines with such distortion (Ref. 9). This work apparently resolved the existing problems, and
little or no additional work appears to have been done until the early 1950's.

B.  CURRENT  INTEREST 

The advent of the digital computer in the early 1950's and the resulting military and
commercial interest in large-scale information processing systems led to a new interest in using
telephone lines for transmitting digital information. This time, however, the high operating
speeds of these systems, coupled with the possibility of a widespread use of telephone lines,
made it desirable to attempt a more efficient utilization of the telephone channel. Starting about
1954, people at both Lincoln Laboratory (Refs. 10, 11) and Bell Telephone Laboratories
(Refs. 12-14) began investigating this problem. These and other studies were continued by a
moderate but ever increasing number of people during the late 1950's (Refs. 15-21). By 1960,
numerous systems for obtaining high data rates (over 500 bits/second) had been proposed, built,
and tested (Refs. 22-30). These systems were, however, still quite poor in comparison to what
many people felt to be possible. Because of this, and due also to a growing interest in the
application of coding techniques to this problem, work has continued at a rapidly increasing pace
up to the present (Refs. 31-43). Today it is necessary only to read magazines such as Fortune,
Business Week, or U.S. News and World Report (Refs. 44-47) to observe the widespread interest
in this use of the telephone network.

C.  REVIEW  OF  CURRENT  TECHNOLOGY 

The following paragraphs discuss some of the current data transmission systems, or
modems (modulator-demodulator), and indicate the basic techniques used along with the resulting
performance. Such state-of-the-art information is useful in evaluating the theoretical results
obtained in the subsequent analysis.

Two numbers often used in comparing digital communication systems are rate R in bits
per second and the probability of error $P_e$. However, the peculiar properties of telephone
channels (Sec. I-D) cause the situation to be quite different here. In fact, most present modems
operating on a telephone line whose phase distortion is within "specified limits" have a $P_e$ of
$10^{-4}$ to $10^{-6}$ that is essentially independent of rate as long as the rate is below some
maximum value. As a result, numbers useful for comparison purposes are the maximum rate and the
"specified limit" on phase distortion; the latter number is usually defined as the maximum
allowable differential delay† over the telephone line passband.

Probably the best-known current modems are those used by the Bell Telephone System in
the Data-Phone service. At present, at least two basic systems cover the range from 500 to
approximately 2400 bits/second. The simplest of these modems is an FSK system operating at
600 or 1200 bits/second. Alexander, Gryb, and Nast have found that the 600-bits/second
system will operate without phase compensation over essentially any telephone circuit in the
country and that the 1200-bits/second system will operate over most circuits with a "universal"
phase compensator. A second system used by Bell for rates of about 2400 bits/second is a
single frequency four-phase differentially modulated system (Ref. 48). At present, little
additional information is available concerning the sensitivity of this system to phase distortion.

Another system operating at rates of 2400 to 4800 bits/second has been developed by Rixon
Electronics, Inc. (Refs. 24, 29). This modem uses binary AM with vestigial side-band
transmission and requires telephone lines having maximum differential delays of 200 to 400 μsec.

A third system, the Collins Radio Company Kineplex (Refs. 27, 49), has been designed to obtain
data rates of 4000 to 5000 bits/second. This modem was one of the first to use signal design
techniques in an attempt to overcome some of the special problems encountered on the telephone
channel. Basically, this system uses four-phase differential modulation of several sinusoidal
carriers spaced in frequency throughout the telephone line passband. The differential delay
requirements for this system are essentially the same as for the Rixon system at high rate,
i.e., about 200 μsec.

Probably the most sophisticated of the modems constructed to date was used at Lincoln
Laboratory in a recently reported experiment. In this system, the transmitted signal was
"matched" to the telephone line so that the effect of phase distortion was essentially
eliminated.‡ Use of this modem with the SECO machine (a sequential coder-decoder; Ref. 50) and a
feedback channel allowed virtually error-free transmission at an average rate of 7500 bits/second.

† Absolute time delay is defined to be the derivative, with respect to radian frequency, of the
telephone line phase characteristic. Differential delay is defined in terms of this by subtracting
out any constant delay. The differential delay for a "typical" telephone line might be 4 to
6 msec at the band edges.

‡ This same approach appears to have been developed independently at IBM (Ref. 42).

D. CHARACTERISTICS OF TELEPHONE LINE AS CHANNEL
FOR DIGITAL COMMUNICATION

Since telephone lines have been designed primarily for voice communication, and since the
properties required for voice transmission differ greatly from those required for digital
transmission, numerous studies have been made to evaluate the properties that are most significant
for this application (see Refs. 10, 12, 14, 17, 31, 33). One of the first properties to be recognized
was the wide variation of detailed characteristics of different lines. However, later studies
have shown that only a few phenomena are responsible for the characteristics that affect digital
communication most significantly. In an order more or less indicative of their relative
importance, these are as follows.

Intersymbol Interference:† Intersymbol interference is a term commonly applied to an
undesired overlap (in time) of received signals that were transmitted as separate pulses. This
effect is caused by both the finite bandwidth and the nonlinear phase characteristic of the
telephone line, and leads to significant errors even in the absence of noise or to a significant
reduction in the signaling rate.‡ It is possible to show, however, that the nonlinear phase
characteristic is the primary source of intersymbol interference.

The severity of the intersymbol interference problem can be appreciated from the fact that
the maximum rate of current modems is essentially determined by two factors: (1) the
sensitivity of the particular signaling scheme to intersymbol interference, and (2) the "specified
limit" on phase distortion, the latter being in some sense a specification of allowable
intersymbol interference. In none of these systems does noise play a significant role in determining
rate as it does, for example, in the classical additive, white Gaussian noise channel (Ref. 51).
Thus, current modems trade rate for sensitivity to phase distortion: a higher rate requires a
lower "specified limit" on phase distortion, and vice versa.

† Implicit in this discussion of intersymbol interference is the assumption that the telephone
line is a linear device. Although this may not be strictly true, it appears to be a valid
approximation in most situations.

‡ Alternately, and equivalently, intersymbol interference can be viewed in the time domain as
arising from the long impulse response of the line (typically 10 to 15 msec duration).

Impulse and Low-Level Noise: Experience has shown that the noise at the output of a
telephone line appears to be the sum of two basic types. One type of noise, low-level noise, is
typically 20 to 50 db below normal signal levels and has the appearance of Gaussian noise
superimposed on harmonics of 60 cps. The level and character of this noise are such that it has
negligible effect on the performance of current modems. The second type of noise, impulse noise,
differs from low-level noise in several basic attributes (Ref. 52). First, its appearance when
viewed on an oscilloscope is that of rather widely separated (on the order of seconds, minutes, or
even days) bursts of relatively long (on the order of 5 to 50 msec) transient pulses. Second, the
level of impulse noise may be as much as 10 db above normal signal levels. Third, impulse
noise appears difficult to characterize in a statistical manner suitable for deriving optimum
detectors. Because of these characteristics, present modems make little or no attempt to
combat impulse noise; furthermore, impulse noise is the major source of errors in these
systems. In fact, most systems operating at a rate such that intersymbol interference is a
negligible factor in determining probability of error will be found to have an error rate almost
entirely dependent upon impulse noise; typical error rates are 1 in $10^4$ to 1 in $10^6$
(Ref. 33).

Phase Crawl: Phase crawl is a term applying to the situation in which the received signal
spectrum is displaced in frequency from the transmitted spectrum. Typical displacements are
from 0 to 10 cps and arise from the use of nonsynchronous oscillators in frequency translations
performed by the telephone company. Current systems overcome this effect by various
modulation techniques such as AM vestigial sideband, differentially modulated FM, and carrier
recovery with re-insertion.

Dropout: The phenomenon called dropout occurs when for some reason the telephone line
appears as a noisy open circuit. Dropouts are usually thought to last for only a small fraction
of a second, although an accidental opening of the line can clearly lead to a much longer dropout.
Little can be done to combat this effect except for the use of coding techniques.

Crosstalk: Crosstalk arises from electromagnetic coupling between two or more lines in
the same cable. Currently, this is a secondary problem relative to intersymbol interference
and impulse noise.
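
The intersymbol interference entry in the list above can be made concrete with a small numerical
sketch. It assumes a discrete-time approximation and a purely hypothetical, exponentially decaying
impulse response of about 12 msec (chosen only to be comparable to the 10 to 15 msec duration
quoted for a typical line, not measured from one); the function name `leakage_fraction` and all
parameter values are illustrative assumptions. The sketch measures how much of one pulse's
received energy spills outside its own symbol slot.

```python
import numpy as np

def leakage_fraction(h, pulse_len):
    """Fraction of one pulse's received energy that falls outside its own symbol slot."""
    pulse = np.ones(pulse_len)            # rectangular pulse, one symbol long
    received = np.convolve(pulse, h)      # channel output for a single isolated pulse
    total = np.sum(received ** 2)
    own_slot = np.sum(received[:pulse_len] ** 2)
    return float(1.0 - own_slot / total)

dt = 1e-4                                 # 0.1 msec sample spacing (assumed)
pulse_len = 10                            # 1 msec symbol duration (assumed)
t = np.arange(120) * dt                   # 12 msec of impulse response
h_dispersive = np.exp(-t / 3e-3)          # hypothetical decaying response, not a measured line
h_ideal = np.array([1.0])                 # channel with no time dispersion

leak_dispersive = leakage_fraction(h_dispersive, pulse_len)
leak_ideal = leakage_fraction(h_ideal, pulse_len)
print(leak_ideal, leak_dispersive)
```

With no dispersion the leakage is zero; with the long response most of the pulse energy arrives
during later symbol slots, which is the overlap described above.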

The previous discussion has indicated the characteristics of telephone lines that affect digital
communication most significantly. It must be emphasized, however, that present modems are
limited in performance almost entirely by intersymbol interference and impulse noise. The
maximum rate is determined primarily by intersymbol interference and the probability of error
is determined primarily by impulse noise. Thus, an improved signaling scheme that considerably
reduces intersymbol interference should allow a significant increase in data rate with a
negligible increase in probability of error. Some justification for believing that this
improvement is possible in practice is given by the experiment at Lincoln Laboratory. In this
experiment, a combination of coding and a signal design that reduced intersymbol interference
allowed performance significantly greater than any achieved previously. Even so, the procedure for
combating intersymbol interference was ad hoc. Thus, the primary objective of this report is
to obtain a fundamental theoretical understanding of optimum signaling techniques for channels
whose characteristics are similar to those of the telephone line.

E.  MATHEMATICAL  MODEL  FOR  DIGITAL  COMMUNICATION  OVER  FIXED  TIME- 
CONTINUOUS  CHANNELS  WITH  MEMORY 

1.  Introduction 

Basic to a meaningful theoretical study of a real-life problem is a model that includes the
important features of the real problem and yet is mathematically tractable. This section
presents a relatively simple, but heretofore incompletely analyzed, model that forms the basis for
the subsequent theoretical work. There are two fundamental reasons for this choice of model:

(a) It represents a generalization of the classical white, Gaussian
noise channel considered by Fano, Shannon, and others. Thus,
any analysis of this channel represents a generalization of
previous work and is of interest independently of any telephone line
considerations.

(b) As indicated previously, the performance of present telephone
line communication systems is limited in rate by intersymbol
interference and in probability of error by impulse noise; the
low-level "Gaussian" noise has virtually no effect on system
performance. However, the frequent occurrence of long
intervals without significant impulse noise activity makes it desirable
to study a channel which involves only intersymbol interference
(it is time dispersive) and Gaussian noise. In this manner, it
will be possible to learn how to reduce intersymbol interference
and thus increase rate to the point where errors caused by
low-level noise are approximately equal in number to errors caused
by impulse noise.

2.  Some  Considerations  in  Choosing  a  Model 

One of the fundamental aims of the present theoretical work is to determine the performance,
or a bound on the performance, of the "best possible" method for digital communication over
fixed time-continuous channels with memory, i.e., channels with intersymbol interference
and/or colored noise. In keeping with this goal, it is desirable to include in the model only
those features that are fundamental to the problem when all practical considerations are removed.
For example, practical constraints often require that digital communication be accomplished by
the serial transmission of short, relatively simple pulses having only two possible amplitudes.
The theoretical analysis will show, however, that for many channels this leads to an extremely
inefficient use of the available channel capacity. In other situations, when communication over
a narrow-band, bandpass channel is desired, it is often convenient to derive the transmitted
signal by using a baseband signal to amplitude, phase or frequency modulate a sinusoidal carrier.
However, on a wide-band, bandpass channel such as the telephone line it is not a priori clear
that this approach is still useful or appropriate although it is certainly still possible. Finally,
it should be recognized that the interest in a model for digital† communication implies that
detection and decision theory concepts are appropriate as opposed to least-mean-square error
filtering concepts that find application in analog communication.


3.  Model 

An appropriate model for digital communication over fixed time-dispersive channels can
be specified in the following manner. An obvious but fundamental fact is that in any real
situation it is necessary to transmit information for only a finite time, say T seconds. This,
coupled with the fact that a model for digital communication is desired, implies that one of only a
finite number, say M, of possible messages is to be transmitted.‡ For the physical situation
being considered, it is useful to think of transmitting this message by establishing a one-to-one
correspondence between the M messages and a set of M signals of T seconds duration and
transmitting the signal that corresponds to the desired message. Furthermore, in any physical
situation, there is only a finite amount of energy, say ST, available with which to transmit the
signal. (Implicit here is the interpretation of S as average signal power.) This fact leads to the
assumption of some form of an energy constraint on the set of signals; a particularly convenient
constraint is that the statistical average of the signal energies is no greater than ST. Thus, if
the signals are denoted by $s_j(t)$ and each signal is transmitted with probability $P_j$, the
constraint is

$$\sum_{j=1}^{M} P_j \int_0^T s_j^2(t)\,dt \le ST \tag{1}$$
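
The average-energy constraint just stated (the probability-weighted mean of the signal energies
must not exceed ST) can be checked numerically for a sampled signal set. The following is a
minimal sketch, assuming the signals are sampled on a uniform grid of spacing `dt` and that the
integral is approximated by a Riemann sum; the function names and the two-signal example are
illustrative assumptions, not part of the report.

```python
import numpy as np

def average_energy(signals, probs, dt):
    """Probability-weighted mean of the signal energies (Riemann-sum approximation)."""
    signals = np.asarray(signals, dtype=float)      # shape (M, samples)
    probs = np.asarray(probs, dtype=float)          # shape (M,), sums to 1
    energies = np.sum(signals ** 2, axis=1) * dt    # approximates the integral of s_j(t)^2
    return float(np.dot(probs, energies))

def satisfies_constraint(signals, probs, dt, S, T, tol=1e-9):
    """True if the average signal energy does not exceed S*T (within a small tolerance)."""
    return average_energy(signals, probs, dt) <= S * T + tol

# Hypothetical example: two antipodal sinusoids on [0, T), each of unit energy.
T, n = 1.0, 1000
dt = T / n
t = np.arange(n) * dt
signals = np.stack([np.sqrt(2.0) * np.sin(2 * np.pi * t),
                    -np.sqrt(2.0) * np.sin(2 * np.pi * t)])
probs = [0.5, 0.5]
avg = average_energy(signals, probs, dt)
print(avg, satisfies_constraint(signals, probs, dt, S=1.0, T=T))
```

Note that the constraint limits only the average: an individual signal may exceed ST in energy
provided the weighted mean over the whole set does not.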


Next, the time-dispersive nature of the channel must be included in the model. A model
for this effect is simply a linear time-invariant filter. The only assumption required on this
filter is that its impulse response have finite energy, i.e., that

$$\int_{-\infty}^{\infty} h^2(t)\,dt < \infty \tag{2}$$


† The word "digital" is used here and throughout this work to mean that there are only a finite
number of possible messages in a finite time interval. It should not be construed to mean "the
transmission of binary symbols" or anything equally restrictive.

‡ At this point, no practical restrictions will be placed on T or M. So, for example, perfectly
allowable values for T and M might be T = 3 × $10^7$ seconds ≈ 1 year and M = 10 [exponent
illegible in source]. This is done to allow for a very general formulation of the problem. Later
analysis will consider more practical situations.
It is convenient, however, to assume, as is done throughout this work, that the filter is
normalized so that $\max_f |H(f)| = 1$, where

$$H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-j\omega t}\,dt, \qquad \omega = 2\pi f .$$

Furthermore, to make the entire problem nontrivial, some noise must be considered. Since the
assumption of Gaussian noise leads to mathematically tractable results and since a portion of the
noise on telephone lines appears to be "approximately" Gaussian, this form for the noise is
assumed in the model. Moreover, since actual noise appears to be additive, that is also assumed.
For purposes of generality, however, the noise will be assumed to have an arbitrary spectral
density N(f).

Finally, to enable the receiver to determine the transmitted message it is necessary to
observe the received signal (the filtered transmitted signal corrupted by the additive noise) over
an interval of, say, $T_1$ seconds, and to make a decision based upon this observation.†

In summary, the model to be analyzed is the following: given a set of M signals of T seconds
duration satisfying the energy constraint of Eq. (1), a message is to be transmitted by selecting
one of the signals and transmitting it through the linear filter h(t). The filter output is assumed
to be corrupted by (possibly colored) Gaussian noise and the receiver is to decide which message
was transmitted by observing the corrupted signal for $T_1$ seconds.
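
The model summarized above (signal, linear filter, additive colored Gaussian noise, finite
observation interval) can be sketched in discrete time. This is only an illustration under stated
assumptions: the convolution integral is approximated by a sum scaled by `dt`, the colored noise
is produced by passing white Gaussian samples through a hypothetical shaping filter
`noise_shaper`, and no attempt is made to calibrate the noise level to a particular spectral
density N(f). The function name and all example values are assumptions.

```python
import numpy as np

def simulate_channel(s, h, noise_shaper, dt, T1, rng):
    """Return the received samples observed over T1 seconds for transmitted samples s."""
    s = np.asarray(s, dtype=float)
    h = np.asarray(h, dtype=float)
    noise_shaper = np.asarray(noise_shaper, dtype=float)
    n_obs = int(round(T1 / dt))                       # number of observed samples
    filtered = np.convolve(s, h) * dt                 # approximates the convolution integral
    signal_part = np.zeros(n_obs)
    m = min(n_obs, filtered.size)
    signal_part[:m] = filtered[:m]                    # truncate or zero-extend to T1 seconds
    white = rng.standard_normal(n_obs + noise_shaper.size - 1)
    noise = np.convolve(white, noise_shaper, mode="valid")   # colored Gaussian noise, n_obs long
    return signal_part + noise

# Hypothetical example: unit pulse through a non-dispersive filter.
dt = 0.01
s = np.ones(10)
h_identity = np.array([1.0 / dt])                     # discrete stand-in for an impulse
rng = np.random.default_rng(0)
y_clean = simulate_channel(s, h_identity, np.array([0.0]), dt, T1=0.2, rng=rng)
y_noisy = simulate_channel(s, h_identity, np.array([0.1, 0.05]), dt, T1=0.2, rng=rng)
```

A receiver for this model would operate on arrays such as `y_noisy`; the decision rule itself is
outside the scope of this sketch.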

Given  the  above  model,  a  meaningful  performance  criterion  is  probability  of  error.  On  the 
basis  of  this  criterion,  three  fundamental  questions  can  be  posed. 


(1) Given a set of signals $\{s_j(t)\}$, what form of decision device should
be used?

(2) What is the resulting probability of error?

(3) How should a set of signals be selected to minimize the probability
of error?

The answer to the first question involves well-known techniques and will be discussed only
briefly in Chapter III. The determination of the answers to the remaining two questions is the
primary concern of the theoretical portion of this report.

In  conclusion,  it  must  be  emphasized  that  the  problem  formulated  in  this  section  is  quite 
general.  Thus,  it  allows  for  the  possibility  that  optimum  signals  may  be  of  the  form  of  those 
used  in  current  modems.  The  formulation  has  not,  however,  included  any  practical  constraints 
on  signaling  schemes  and  thus  does  not  preclude  the  possibility  that  an  alternate  and  superior 
technique  may  be  found. 


† At this point, $T_1$ is completely arbitrary. Later it will prove convenient to assume $T_1$
[relation illegible in source], which is the situation of most practical interest.
CHAPTER II

SIGNAL  REPRESENTATION  PROBLEM 


A.  INTRODUCTION 

The previous section presented a model for digital communication over fixed time-dispersive
channels and posed three fundamental questions concerning this model. However, an attempt
to obtain detailed answers to these questions involves considerable difficulty. The source of this
difficulty is that the energy constraint is applied to signals at the filter input, whereas the
probability of error is determined by the structure of these signals at the filter output.
Fortunately, the choice of a signal representation that is "matched" to both the model and the
desired analysis allows the presence of the filter to be handled in a straightforward manner. The
following sections present a brief discussion of the general signal representation problem,
slanted, of course, toward the present analysis, and provide the necessary background for
subsequent work.

B.  SIGNAL  REPRESENTATION 

At the outset, it should be mentioned that many of the concepts, techniques, and terminology
of this section are well known to mathematicians under the title of "Linear Algebra."

As pointed out by Siebert, the fundamental goal in choosing a signal representation for a
given problem is the simplification of the resulting analysis. Thus, for a digital communication
problem, the signal representation is chosen primarily to simplify the evaluation of probability
of error. One representation which has been found to be extremely useful in such problems
(due largely to the widespread assumption of Gaussian noise) pictures signals as points in an
n-dimensional Euclidean vector space.


1.  Vector  Space  Concept 

In a digital communication problem it is necessary to represent a finite number, M, of
signals. One way to accomplish this is to write each signal as a linear combination of a (possibly
infinite) set of orthonormal "basis" functions $\{\varphi_i(t)\}$, i.e.,

$$s_j(t) = \sum_{i=1}^{n} s_{ij}\,\varphi_i(t) \tag{3}$$

where

$$s_{ij} = \int \varphi_i(t)\, s_j(t)\,dt .$$


When this is done, it is found in many cases that the resulting probability of error analysis
depends only on the numbers $s_{ij}$ and is independent of the basis functions
$\{\varphi_i(t)\}$. In such cases, a vector or n-tuple $\mathbf{s}_j$ can be defined as

$$\mathbf{s}_j = (s_{1j}, s_{2j}, \ldots, s_{kj}, \ldots, s_{nj})$$

which, in so far as the analysis is concerned, represents the time function $s_j(t)$. Thus, it is
possible to view $\mathbf{s}_j$ as a straightforward generalization of a three-dimensional vector
and $s_j(t)$ as a vector in an n-dimensional vector space. The utility of this viewpoint is clear
from its widespread use in the literature. Two basic reasons for this usefulness, at least in
problems with Gaussian noise, are clear from the following relations which are readily derived
from Eq. (3). The energy of a signal $s_j(t)$ is given by

$$\int s_j^2(t)\,dt = \sum_{i=1}^{n} s_{ij}^2 = \mathbf{s}_j \cdot \mathbf{s}_j$$

and the cross correlation between two signals $s_j(t)$ and $s_k(t)$ is given by

$$\int s_j(t)\, s_k(t)\,dt = \sum_{i=1}^{n} s_{ij}\, s_{ik} = \mathbf{s}_j \cdot \mathbf{s}_k$$

where $(\;) \cdot (\;)$ denotes the standard inner product.
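
The two relations above (signal energy equals the squared length of the coefficient vector, and
cross correlation equals the inner product of two coefficient vectors) can be verified
numerically. The sketch below is an illustration under assumptions: it uses a sine basis on
[0, T) sampled on a uniform grid, chosen here only because it is exactly orthonormal under the
Riemann sum used; it is not the basis developed later in the report, and all names are
hypothetical.

```python
import numpy as np

T, n, n_basis = 1.0, 2000, 8
dt = T / n
t = np.arange(n) * dt
# Rows are phi_i(t), i = 1..n_basis: an assumed orthonormal sine basis on [0, T).
basis = np.array([np.sqrt(2.0 / T) * np.sin(np.pi * i * t / T)
                  for i in range(1, n_basis + 1)])

def coefficients(signal):
    """Expansion coefficients: integral of phi_i(t) * s(t) dt, as a Riemann sum."""
    return basis @ signal * dt

rng = np.random.default_rng(0)
s_j_vec = rng.standard_normal(n_basis)      # coefficient vector of signal j
s_k_vec = rng.standard_normal(n_basis)      # coefficient vector of signal k
s_j = s_j_vec @ basis                       # time function s_j(t) built from the expansion
s_k = s_k_vec @ basis

energy_time = np.sum(s_j ** 2) * dt         # energy computed in the time domain
energy_vec = s_j_vec @ s_j_vec              # energy computed from the vector
corr_time = np.sum(s_j * s_k) * dt          # cross correlation in the time domain
corr_vec = s_j_vec @ s_k_vec                # inner product of the two vectors
print(energy_time - energy_vec, corr_time - corr_vec)
```

Both differences are at rounding-error level, which is why the analysis can work with the
n-tuples alone once an orthonormal basis has been fixed.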
2. Choice of Basis Functions


So far, the discussion of the vector space representation has been concerned with basic
results from the theory of orthonormal expansions. However, an attempt to answer a related
question, how the $\varphi_i(t)$ are to be chosen, leads to results that are far less well defined
and less well known. Fundamentally, this difference arises because the choice of the
$\varphi_i(t)$ depends heavily upon the type of analysis to be performed, i.e., the $\varphi_i(t)$
should be chosen to "simplify the analysis as much as possible." Since such a criterion clearly
leads to no specific rule for determining the $\varphi_i(t)$, it is possible only to indicate
situations in which distinctly different basis functions might be appropriate.


Band-Limited Signals: A set of basis functions used widely for representing band-limited
signals is the set of $\varphi_i(t)$ defined by

$$ \varphi_i(t) = \sqrt{2W}\, \frac{\sin 2\pi W [t - (i/2W)]}{2\pi W [t - (i/2W)]} \tag{4} $$

where W is the signal bandwidth. The popularity of this representation, the so-called sampling
representation, lies almost entirely in the simple form for the coefficient $s_{ij}$. This is readily
shown to be

$$ s_{ij} = \int_{-\infty}^{\infty} \varphi_i(t)\, s_j(t)\, dt = \frac{1}{\sqrt{2W}}\, s_j(i/2W) . \tag{5} $$

It should be recalled, however, that no physical signal can be precisely band-limited. Thus,
any attempt to represent a physical signal by this set of basis functions must give only an
approximate representation. However, it is possible to make the approximation arbitrarily
accurate by choosing W sufficiently large.
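Equation (5) can be illustrated numerically. The sketch below assumes a particular band-limited test signal, a bandwidth of 4 cps, and a truncated integration range; none of these values come from the text, and the function names are illustrative.

```python
import numpy as np

W = 4.0                          # assumed bandwidth in cps (illustrative)
dt = 1e-3
t = np.arange(-40.0, 40.0, dt)   # truncated stand-in for the infinite time axis

def s(x):
    # np.sinc(x) = sin(pi x)/(pi x); its square is band-limited to W here.
    return np.sinc(W * x) ** 2

def phi(i, x):
    # Eq. (4): sqrt(2W) sin(2 pi W [x - i/2W]) / (2 pi W [x - i/2W])
    return np.sqrt(2 * W) * np.sinc(2 * W * x - i)

def coeff_by_integration(i):
    # Left-hand side of Eq. (5), approximated by a Riemann sum.
    return np.sum(phi(i, t) * s(t)) * dt

def coeff_by_sampling(i):
    # Right-hand side of Eq. (5): a scaled time sample.
    return s(i / (2 * W)) / np.sqrt(2 * W)
```

For this test signal the two functions agree closely for any integer i, so the coefficient is obtained from a single time sample rather than from an integral.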

Time-Limited  Signals:—  Time-limited  signals  are  often  represented  by  any  one  of 
several  forms  of  a  Fourier  series.  These  representations  are  well  known  to  engineers  and  any 
discussion  here  would  be  superfluous.  It  is  worth  noting  that  this  representation,  in  contrast 
to  the  sampling  representation,  is  exact  for  any  signal  of  engineering  interest. 

Arbitrary Set of M Signals: The previously described representations share the
property that, in general, an infinite number of basis functions are required to represent a
finite number of signals. However, in problems involving only a finite number of signals, it
is sometimes convenient to choose a different set of basis functions so that no more than M basis
functions are required to represent M signals. A proof that such a set exists, along with the
procedure for finding the functions, has been presented by Arthurs and Dym. This result,
although well known to mathematicians, appears to have just recently been recognized by electrical
engineers.
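One standard way to build such a basis is Gram-Schmidt orthogonalization. The sketch below applies it to sampled signals; it is an illustration under assumed signals and grid, and is not claimed to reproduce the cited procedure in detail.

```python
import numpy as np

def gram_schmidt(signals, dt, tol=1e-10):
    """Return orthonormal rows spanning the sampled signals (rows of `signals`)."""
    basis = []
    for s in signals:
        residual = s.astype(float).copy()
        for phi in basis:
            # Remove the component of the residual along each earlier basis function.
            residual -= (np.sum(residual * phi) * dt) * phi
        energy = np.sum(residual ** 2) * dt
        if energy > tol:             # keep only directions that are genuinely new
            basis.append(residual / np.sqrt(energy))
    return np.array(basis)

dt = 1e-3
t = np.arange(0.0, 1.0, dt)
# Three assumed signals; the third is a linear combination of the first two.
signals = np.array([np.sin(2 * np.pi * t), t, 2.0 * np.sin(2 * np.pi * t) - 3.0 * t])
basis = gram_schmidt(signals, dt)
coeffs = signals @ basis.T * dt      # s_ij = integral of s_j(t) phi_i(t) dt
reconstructed = coeffs @ basis
```

Here M = 3 signals need only two basis functions, consistent with the statement that no more than M are required.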

Signals Corrupted by Additive Colored Gaussian Noise: In problems involving signals
of T seconds duration imbedded in colored Gaussian noise, it is often desirable to represent both
signals and noise by a set of basis functions such that the noise coefficients are statistically
independent random variables. If the noise autocorrelation function† is $R(\tau)$, it is well known
from the Karhunen-Loeve theorem (Ref. 54) that the $\varphi_i(t)$ satisfying the integral equation

$$ \lambda_i \varphi_i(t) = \int_0^T \varphi_i(\tau)\, R(t - \tau)\, d\tau \qquad 0 \le t \le T $$

form such a set of basis functions.
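A discretized version of this integral equation is a symmetric matrix eigenproblem. The sketch below assumes an exponential autocorrelation and a uniform grid (both illustrative, not from the text) and checks that the resulting noise coefficients are uncorrelated.

```python
import numpy as np

T, n = 1.0, 200
dt = T / n
t = (np.arange(n) + 0.5) * dt

# Assumed autocorrelation R(tau) = exp(-|tau|); any valid R(tau) could be used.
R = np.exp(-np.abs(t[:, None] - t[None, :]))

# Discretized integral equation: sum over s of R(t - s) phi(s) dt = lambda phi(t).
w, v = np.linalg.eigh(R * dt)
order = np.argsort(w)[::-1]
lam = w[order]
phi = v[:, order].T / np.sqrt(dt)    # rows are phi_i with sum(phi_i**2) * dt = 1

# Covariance of the noise coefficients n_i = integral of n(t) phi_i(t) dt.
coeff_cov = phi @ R @ phi.T * dt * dt
```

In this discretization `coeff_cov` is diagonal with the eigenvalues on the diagonal; for Gaussian noise, uncorrelated coefficients are statistically independent.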

Filtered Signals: Consider the following somewhat artificial situation closely related
to the results of Sec. II-D. A set of M finite energy signals defined on the interval [0, T] is
given. Also given is a nonrealizable linear filter whose impulse response satisfies $h(t) = h(-t)$.
It is desired to represent both the given signals and the signals that are obtained when these are
passed through the filter by orthonormal expansions defined on the interval [0, T]. In general,
arbitrary and different sets of basis functions can be chosen for both representations. Then the relation
between the input signal vector $\mathbf{s}_j$ and the output signal vector, say $\mathbf{r}_j$, is determined as follows:
Let $r_j(t)$ be the filter output when $s_j(t)$ is the filter input. Then

$$ r_j(t) \triangleq \int_0^T s_j(\tau)\, h(t - \tau)\, d\tau = \sum_i s_{ij} \int_0^T \alpha_i(\tau)\, h(t - \tau)\, d\tau $$

where $\{\alpha_i(t)\}$ are the basis functions for the input signals. Thus, if $\{\gamma_k(t)\}$ are the basis functions
for the output signals, it follows that

$$ r_{kj} \triangleq \int_0^T r_j(t)\, \gamma_k(t)\, dt = \sum_i s_{ij} \int_0^T \!\! \int_0^T \gamma_k(t)\, h(t - \tau)\, \alpha_i(\tau)\, d\tau\, dt $$

or, in vector notation,

$$ \mathbf{r}_j = [H]\, \mathbf{s}_j \tag{6} $$

where the k,i-th element in [H] is

$$ \int_0^T \!\! \int_0^T \gamma_k(t)\, h(t - \tau)\, \alpha_i(\tau)\, d\tau\, dt $$

and, in general, $\mathbf{r}_j$ and $\mathbf{s}_j$ are infinite dimensional column vectors and [H] is an infinite
dimensional matrix.

† For the statement made here to be strictly true, it is sufficient that $R(\tau)$ be the autocorrelation function of
filtered white noise.

If, however, a common set of basis functions is used for both input and output signals and
if these $\varphi_i(t)$ are taken to be the solutions of the integral equation†

$$ \lambda_i \varphi_i(t) = \int_0^T \varphi_i(\tau)\, h(t - \tau)\, d\tau \qquad 0 \le t \le T $$

then the relation of Eq. (6) will still be true but now [H] will be a diagonal matrix with the
eigenvalues $\lambda_i$ along the main diagonal. This result, which is related to the spectral decomposition
of linear self-adjoint operators, has two important features. First, and most obvious, the
diagonalization of the matrix [H] leads to a much simplified calculation of $\mathbf{r}_j$ given $\mathbf{s}_j$. Equally
important, however, this form for [H] has entries depending only upon the filter impulse
response h(t). Although not a priori obvious, these two features are precisely those required of
a signal representation to allow evaluation of probability of error for digital transmission over
time-dispersive channels.
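The contrast between an arbitrary basis and the eigenfunction basis can be seen in a discretized sketch. The impulse response, grid, and cosine comparison basis below are assumptions made for illustration only.

```python
import numpy as np

T, n = 1.0, 200
dt = T / n
t = (np.arange(n) + 0.5) * dt

# Assumed symmetric (nonrealizable) impulse response h(t) = exp(-4|t|).
h = lambda x: np.exp(-4.0 * np.abs(x))
A = h(t[:, None] - t[None, :]) * dt   # (A f)(t) ~ integral of h(t - tau) f(tau) dtau

def h_matrix(out_basis, in_basis):
    # [H]_{k,i}: double integral of out_k(t) h(t - tau) in_i(tau) dtau dt
    return out_basis @ A @ in_basis.T * dt

# An arbitrary orthonormal basis (cosines) gives a full matrix.
cosines = np.array([np.sqrt(2.0) * np.cos(np.pi * (k + 1) * t) for k in range(5)])
H_cos = h_matrix(cosines, cosines)

# Eigenfunctions of the integral equation give a diagonal matrix of eigenvalues.
lam, v = np.linalg.eigh(A)
eig_basis = v[:, ::-1][:, :5].T / np.sqrt(dt)
H_eig = h_matrix(eig_basis, eig_basis)
```

In this sketch `H_cos` has clearly nonzero off-diagonal entries, while `H_eig` is diagonal with the five largest eigenvalues on its main diagonal.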

C.  DIMENSIONALITY  OF  FINITE  SET  OF  SIGNALS 

This  section  concludes  the  general  discussion  of  the  signal  representation  problem  by  pre¬ 
senting  a  definition  of  the  dimensionality  of  a  finite  set  of  signals  that  is  of  independent  interest 
and  is,  in  addition,  of  considerable  use  in  defining  the  dimensionality  of  a  communication  channel. 

An  approximation  often  used  by  electrical  engineers  is  given  by  the  statement  that  a  signal 
which  is  "approximately"  time-limited  to  T  seconds  and  "approximately"  band-limited  to  W  cps 
has  2TW  "degrees  of  freedom";  i.e.,  that  such  a  signal  has  a  "dimensionality"  of  2TW.  This 
approximation  is  usually  justified  by  a  conceptually  simple  but  mathematically  unappealing  argu¬ 
ment  based  upon  either  the  sampling  representation  or  the  Fourier  series  representation  pre¬ 
viously  discussed.  However,  fundamental  criticisms  of  this  statement  make  it  desirable  to 
adopt  a  different  and  mathematically  precise  definition  of  "dimensionality"  that  overcomes  these 
criticisms  and  yet  retains  the  intuitive  appeal  of  the  statement.  Specifically,  these  criticisms 
are: 


(1) If a (strictly) band-limited nonzero signal is assumed, it is known that
this signal must be nonzero over any time interval of nonzero length.
Thus, any definition of the "duration" T of such a signal must be arbitrary,
implying an arbitrary "dimensionality," or equally unappealing,
the signal must be considered to be of infinite duration and therefore of
infinite "dimensionality." Conversely, if a time-limited signal is assumed,
it is known that its energy spectrum exists for all frequencies.
Thus, any attempt to define "bandwidth" for such a signal leads to similar
problems. Clearly, the situation in which a signal is neither band-limited
nor time-limited, e.g., $s(t) = \exp[-|t|]$ where $-\infty < t < \infty$, leads
to even more difficulties.‡

(2) The fundamental importance of the concept of the "dimensionality" of a
signal is that it indicates, hopefully, how many real numbers must be
given to specify the signal. Thus, when signals are represented as points
in n-dimensional space, it is often useful to define the dimensionality of
a signal to be the dimensionality of the corresponding vector space. This
definition, however, may lead to results quite different from those obtained
using the concept of "duration" and "bandwidth." For example,
consider an arbitrary finite energy signal s(t). Then by choosing for
an orthonormal basis the single function

$$ \varphi_1(t) = \frac{s(t)}{\left[ \int_{-\infty}^{\infty} s^2(t)\, dt \right]^{1/2}} $$

it follows that

$$ s(t) = s_1 \varphi_1(t) $$

where, as usual,

$$ s_1 = \int_{-\infty}^{\infty} s(t)\, \varphi_1(t)\, dt . $$

Thus, this definition of the dimensionality of s(t) indicates that it is only
one dimensional in contrast to the arbitrary (or infinite) dimensionality
found previously. Clearly, such diversity of results leaves something to
be desired.

† Again, there are mathematical restrictions on h(t) before the following statements are strictly true. These
conditions are concerned with the existence and completeness of the $\varphi_i(t)$ and are of secondary interest at
this point.

‡ It should be mentioned that identical problems are encountered when an attempt is made to define the
"dimensionality" of a channel in a similar manner. This problem will be discussed in detail in Chapter III.


Although this discussion may seem somewhat confusing and puzzling, the reason for the
widely different results is readily explained. Fundamentally, the time-bandwidth definition of
signal dimensionality is an attempt to define the "useful" dimensionality of the vector space
obtained when the basis functions are restricted to be either the band-limited (sin x)/x functions or
the time-limited sine and cosine functions. In contrast, the second definition of dimensionality
allowed an arbitrary set of basis functions and in doing so allowed the $\varphi_i(t)$ to be chosen to
minimize the dimensionality of the resulting vector space.

In view of the above discussions, and because it will prove useful later, the following
definition for the dimensionality of a set of signals will be adopted:†

Let S be a set of M finite energy signals and let each signal in this set be
represented by a linear combination of a set of orthonormal functions, i.e.,

$$ s_j(t) = \sum_{i=1}^{N} s_{ij}\, \varphi_i(t) \qquad \text{for all } s_j(t) \in S . $$

Then the dimensionality d of this set of signals is defined to be the minimum
of N over all possible basis functions, i.e.,

$$ d \triangleq \min_{\{\varphi_i(t)\}} N . $$

The proof that such a number d exists, that $d \le M$, and the procedure for finding the $\{\varphi_i(t)\}$
have been presented elsewhere (Ref. 59) and will not be considered here. It should be noted, however,
that the definition given is unambiguous and, as indicated, is quite useful in the later work. It
is also satisfying to note that if S is a set of band-limited signals having the property that

$$ s_j(i/2W) = 0 \qquad \text{for all } i < 1 \text{ or } i > 2TW, \quad \text{for all } s_j(t) \in S $$

then the above definition of dimensionality leads to d = 2TW. A similar result is also obtained
for a set of time-limited functions whose frequency samples all vanish except for a set of 2TW
values.

† This definition is just the translation into engineering terminology of a standard definition of linear algebra
(Refs. 55, 66).
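For sampled signals, the minimum number of orthonormal functions equals the dimension of the span of the signals, which in turn equals the rank of their Gram matrix. The sketch below uses that linear-algebra fact; the signals, the grid, and the tolerance are assumptions made for illustration.

```python
import numpy as np

def dimensionality(signals, dt, rel_tol=1e-9):
    """Estimate d as the rank of the Gram matrix G[j, k] = integral of s_j(t) s_k(t) dt."""
    gram = signals @ signals.T * dt
    eigvals = np.linalg.eigvalsh(gram)
    # Count eigenvalues that are not negligible relative to the largest one.
    return int(np.sum(eigvals > rel_tol * eigvals.max()))

dt = 1e-3
t = np.arange(0.0, 1.0, dt)
s1 = np.exp(-t)
s2 = np.cos(2 * np.pi * t)
# M = 4 assumed signals that span only a two-dimensional space.
S = np.array([s1, s2, s1 + s2, 3.0 * s1])
d = dimensionality(S, dt)
d_single = dimensionality(S[:1], dt)
```

Here d = 2 although M = 4, and a single signal has dimensionality one, in agreement with the one-function basis example of criticism (2).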

D.  SIGNAL  REPRESENTATION  FOR  FIXED  TIME-CONTINUOUS  CHANNELS
WITH  MEMORY

In Sec. I-E, it was demonstrated that a useful model for digital communication over fixed
time-continuous channels with memory considers the transmission of signals of T-seconds duration
through the channel of Fig. 1 and the observation of the received signal y(t) for $T_1$ seconds.
Given this model, the problem is to determine the probability of error for an optimum detector
and a particular set of signals and then to minimize the probability of error over the available
sets of signals. Under the assumption that the vector space representation is appropriate for
this situation, there remains the problem of selecting the basis functions for both the transmitted
signals x(t) and the received signals y(t).


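[Block diagram: the input x(t) drives a linear filter h(t) whose output r(t) is added to the noise n(t)
to form y(t). The noise n(t) is Gaussian with spectral density N(f). Normalization:
$\max_f |H(f)| = 1$ and $\max_f \left[ |H(f)|^2 / N(f) \right] = 1$.]

Fig. 1. Time-continuous channel with memory.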


If $\mathbf{x}$, $\mathbf{n}$, and $\mathbf{y}$ are the (column) vector representations of the transmitted signal (assumed
to be defined on [0, T]), additive noise, and received signal (over the observation interval of $T_1$
seconds), respectively, an arbitrary choice of basis functions will lead to the vector equation

$$ \mathbf{y} = [H]\, \mathbf{x} + \mathbf{n} \tag{7} $$

in which all vectors are, in general, infinite dimensional, $\mathbf{n}$ may have correlated components,
and [H] will be an infinite dimensional matrix related to the filter impulse response and the sets
of basis functions selected.† If, instead, both sets of basis functions are selected in the manner
presented here and elsewhere, it will be found (1) that [H] is a diagonal matrix whose entries are
the square roots of the eigenvalues of a related integral equation, and (2) that the components of
$\mathbf{n}$ are statistically independent and identically distributed Gaussian random variables. Because
of these two properties it is possible to obtain a meaningful and relatively simple bound on
probability of error for the channel considered in this work.

In  the  following  discussion  it  would  be  possible,  at  least  in  principle,  to  present  a  single 
procedure  for  obtaining  the  desired  basis  functions  which  would  be  valid  for  any  filter  impulse 
response,  any  noise  spectral  density,  and  any  observation  interval,  finite  or  infinite.  This 
approach,  however,  leads  to  a  number  of  mathematically  involved  limiting  arguments  when 
white  noise  and/or  an  infinite  observation  interval  is  of  interest.  Because  of  these  difficulties 
the  following  situations  are  considered  separately  and  in  the  order  indicated. 

Arbitrary filter, white noise, arbitrary observation interval (arbitrary $T_1$);

Arbitrary filter, colored noise, infinite observation interval ($T_1 = \infty$);

Arbitrary filter, colored noise, finite observation interval ($T_1 < \infty$).

† It is, of course, possible to obtain statistically independent noise components by using for the receiver signal
space basis functions the orthonormal functions used in the Karhunen-Loeve expansion of the noise. However,
this will not, in general, diagonalize [H].

Due  to  their  mathematical  nature,  the  main  results  of  this  section  are  presented  in  the  form 
of  several  theorems.  First,  however,  some  assumptions  and  simplifying  notation  will  be 
introduced. 


Assumptions. 


(1) The time functions h(t), x(t), y(t), and n(t) are in all cases real.

(2) The input signal x(t) may be nonzero only on the interval [0, T] and has
finite energy; that is, $x(t) \in \mathcal{L}_2(0, T)$ and thus

$$ \int_{-\infty}^{\infty} x^2(t)\, dt = \int_0^T x^2(t)\, dt < \infty . $$

(3) The filter impulse response h(t) is physically realizable and stable (Ref. 56).
Thus, h(t) = 0 for t < 0 and

$$ \int_0^{\infty} |h(t)|\, dt < \infty . $$

(4) The time scale for y(t) is shifted to remove any pure delay in h(t).

(5) The observation interval for y(t) is the interval $[0, T_1]$, unless otherwise
indicated.


Note: Assumptions 3, 4, and 5 have been made primarily to simplify the proof that the $\varphi_i(t)$
are complete. Clearly, these assumptions cause no loss of generality with respect to "real
world" communication problems. Furthermore, completeness proofs, although considerably
more tedious, are possible under only the assumption that

$$ \int_{-\infty}^{\infty} h^2(t)\, dt < \infty . $$


Notation. 


The standard inner product on the interval [0, T] is written (f, g); that is,

$$ (f, g) \triangleq \int_0^T f(t)\, g(t)\, dt . $$

The linear integral operation on f(t, s) by k(t, τ) is written kf(t, s); that is,

$$ kf(t, s) \triangleq \int_{-\infty}^{\infty} k(t, \tau)\, f(\tau, s)\, d\tau . $$

The generalized inner product on the interval $[0, T_1]$ is written $(f, kg)_{T_1}$;
that is,

$$ (f, kg)_{T_1} \triangleq \int_0^{T_1} f(t)\, kg(t)\, dt = \int_0^{T_1} f(t) \int_{-\infty}^{\infty} k(t, \tau)\, g(\tau)\, d\tau\, dt . $$

With these preliminaries the pertinent theorems can now be stated. The following results are
closely related to the spectral decomposition of linear self-adjoint operators on a Hilbert
space (see, e.g., Refs. 63 and 66). It should be mentioned that all the following results can be applied directly to
time-discrete channels by simply replacing integral operators with matrix operators.
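The replacement of integral operators by matrix operators can be sketched directly. The grid spacing, the causal kernel, and the test functions below are assumptions chosen only to illustrate the notation.

```python
import numpy as np

T, T1, dt = 1.0, 2.0, 0.01
t_in = np.arange(0.0, T, dt)       # grid on [0, T)
t_out = np.arange(0.0, T1, dt)     # grid on [0, T1)

def inner(f, g):
    # (f, g): integral of f(t) g(t) dt, approximated by a Riemann sum
    return np.sum(f * g) * dt

# Assumed causal kernel k(t, tau) = h(t - tau) with h(t) = exp(-3t) for t >= 0.
lag = t_out[:, None] - t_in[None, :]
k = np.where(lag >= 0.0, np.exp(-3.0 * np.clip(lag, 0.0, None)), 0.0)

def apply_operator(kernel, g):
    # kg(t) = integral of k(t, tau) g(tau) dtau  ->  matrix-vector product
    return kernel @ g * dt

g = np.sin(np.pi * t_in)
f = np.ones_like(t_out)
generalized = inner(f, apply_operator(k, g))   # (f, kg)_{T1}
as_quadratic_form = f @ k @ g * dt * dt        # same quantity as a bilinear form
```

The generalized inner product becomes a bilinear form in the sampled functions, which is why the theorems carry over to time-discrete channels.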


1.  Basis  Functions  for  White  Noise  and  an  Arbitrary  Observation  Interval

Theorem 1.

Let N(f) = 1, define a symmetric function K(t, s) = K(s, t) in terms of the filter impulse
response by

$$ K(t, s) \triangleq \begin{cases} \displaystyle \int_0^{T_1} h(\tau - t)\, h(\tau - s)\, d\tau & 0 \le t, s \le T \\ 0 & \text{elsewhere} \end{cases} $$

and define a set of eigenfunctions and eigenvalues by

$$ \lambda_i \varphi_i(t) = K\varphi_i(t) \qquad 0 \le t \le T, \quad i = 1, 2, \ldots \tag{8} $$

[Here and throughout the remainder of this work, it is assumed that $\varphi_i(t) = 0$ for t < 0 and
t > T, i = 1, 2, ... .] Then the vector representation of an arbitrary $x(t) \in \mathcal{L}_2(0, T)$ is $\mathbf{x}$, where
$x_i = (x, \varphi_i)$, and the vector representation of the corresponding† y(t) on the interval $[0, T_1]$
($T_1 \ge T$)‡ is given by Eq. (7) in which the components of $\mathbf{n}$ are statistically independent and
identically distributed Gaussian random variables with zero mean and unit variance and

$$ [H] = \begin{bmatrix} \sqrt{\lambda_1} & & 0 \\ & \sqrt{\lambda_2} & \\ 0 & & \ddots \end{bmatrix} . $$

The basis functions for y(t) are $\{\theta_i(t)\}$, where

$$ \theta_i(t) = \begin{cases} \dfrac{1}{\sqrt{\lambda_i}}\, h\varphi_i(t) & 0 \le t \le T_1 \\ 0 & \text{elsewhere} \end{cases} $$

and the i-th component of $\mathbf{y}$ is

$$ y_i = (y, \theta_i)_{T_1} . $$

Note: To be consistent with the literature (Refs. 64, 65), it is necessary to denote as eigenfunctions
only those solutions of Eq. (8) having $\lambda_i > 0$. This restriction is required since mathematicians
normally place the eigenvalue on the right-hand side of Eq. (8) and do not consider eigenfunctions
with infinite eigenvalue.

† The representation for y(t) neglects a noise "remainder term" which is irrelevant in the present work. See the
discussion following the proof of Lemma 3 for a detailed consideration of this point.

‡ All the following statements will be true if $T_1 < T$ except that the $\{\varphi_i(t)\}$ will not be complete in $\mathcal{L}_2(0, T)$.
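A discretized sketch of the construction in Theorem 1 follows. The exponential impulse response, the interval lengths, and the grid are assumptions for illustration; in matrix form the construction amounts to a singular value decomposition of the sampled filter operator.

```python
import numpy as np

T, T1, dt = 1.0, 3.0, 0.01
t_in = (np.arange(int(round(T / dt))) + 0.5) * dt
t_out = (np.arange(int(round(T1 / dt))) + 0.5) * dt

# Assumed realizable filter h(t) = exp(-5t), t >= 0 (illustrative only).
lag = t_out[:, None] - t_in[None, :]
A = np.where(lag >= 0.0, np.exp(-5.0 * np.clip(lag, 0.0, None)), 0.0) * dt   # r = A x

# Operator K of Eq. (8): K(t, s) = integral over [0, T1] of h(tau - t) h(tau - s) dtau.
K_op = A.T @ A
lam, v = np.linalg.eigh(K_op)
lam, v = lam[::-1][:6], v[:, ::-1][:, :6]        # keep the six largest eigenvalues
phi = v.T / np.sqrt(dt)                          # (phi_i, phi_j) = delta_ij on [0, T]
theta = (A @ phi.T).T / np.sqrt(lam)[:, None]    # theta_i = h phi_i / sqrt(lambda_i)

gram_theta = theta @ theta.T * dt                # (theta_i, theta_j)_{T1}

x = np.exp(-t_in) * np.sin(4 * np.pi * t_in)     # an arbitrary input on [0, T]
x_coeffs = phi @ x * dt                          # x_i = (x, phi_i)
r_coeffs = theta @ (A @ x) * dt                  # r_i = (r, theta_i)_{T1}
```

In this sketch the output basis is orthonormal on the observation interval and the noiseless output coefficients equal the input coefficients scaled by the square roots of the eigenvalues, i.e., the diagonal [H] of the theorem.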
Proof.

The proof of this theorem consists of a series of Lemmas. The first Lemma demonstrates
that eigenfunctions of Eq. (8) form a basis for $\mathcal{L}_2(0, T)$, i.e., that they are complete.

Lemma 1(a).

If $T_1 \ge T$, the set of functions $\{\varphi_i(t)\}$ defined by Eq. (8) form an orthonormal basis for $\mathcal{L}_2(0, T)$,
that is, for every $x(t) \in \mathcal{L}_2(0, T)$

$$ x(t) = \sum_{i=1}^{\infty} x_i\, \varphi_i(t) $$

where $x_i = (x, \varphi_i)$ and mean-square convergence is obtained.

Proof.

Since the kernel of Eq. (8) is $\mathcal{L}_2$ (see Appendix A) and symmetric, it is well known that at
least one eigenfunction of Eq. (8) is nonzero, that all nonzero and nondegenerate eigenfunctions
are orthogonal (and therefore may be assumed orthonormal) and that degenerate eigenfunctions
are linearly independent and of finite multiplicity and may be orthonormalized. Furthermore,
it is known that the $\{\varphi_i(t)\}$ are complete in $\mathcal{L}_2(0, T)$ if and only if the condition

$$ (f, Kf)_T = 0 \qquad f(t) \in \mathcal{L}_2(0, T) $$

implies f(t) = 0 almost everywhere on [0, T]. But

$$ (f, Kf)_T = \int_0^{T_1} \left[ \int_0^T f(\tau)\, h(t - \tau)\, d\tau \right]^2 dt . $$

Thus, to prove completeness, it suffices to prove that if

$$ \int_0^T f(\tau)\, h(t - \tau)\, d\tau = 0 \qquad 0 \le t \le T_1 \text{ with } T_1 \ge T $$

then f(t) = 0 almost everywhere on [0, T]. Let

$$ u(t) = \int_0^T f(\tau)\, h(t - \tau)\, d\tau $$

and assume that

$$ u(t) = \begin{cases} 0 & t \le T_1 \\ z(t - T_1) & t > T_1 \end{cases} $$

where z(t) is zero for t < 0 and is arbitrary elsewhere, except that it must be the result of
passing some $\mathcal{L}_2(0, T)$ signal through h(t). Then

$$ U(s) \triangleq \int_0^{\infty} u(t)\, e^{-st}\, dt = e^{-sT_1} Z(s) = F(s)\, H(s) \qquad \mathrm{Re}[s] \ge 0 . \tag{9} $$
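Now, for $\mathrm{Re}[s] \ge 0$,

$$ |Z(s)| = \left| \int_0^{\infty} z(t)\, e^{-st}\, dt \right| \le \int_0^{\infty} |z(t)| \exp\{-\mathrm{Re}[s]\, t\}\, dt \le \int_0^{\infty} |z(t)|\, dt = \int_{T_1}^{\infty} \left| \int_0^T f(\tau)\, h(t - \tau)\, d\tau \right| dt $$

$$ \le \int_0^T |f(\tau)| \int_{T_1}^{\infty} |h(t - \tau)|\, dt\, d\tau \le \int_0^T |f(\tau)|\, d\tau \int_0^{\infty} |h(t)|\, dt . $$

However, from the Schwarz inequality,

$$ \int_0^T |f(\tau)|\, d\tau \le \left[ T \int_0^T f^2(\tau)\, d\tau \right]^{1/2} $$

and, from assumption 3,

$$ \int_0^{\infty} |h(t)|\, dt < \infty . $$

Thus,

$$ |Z(s)| \le \left[ T \int_0^T f^2(\tau)\, d\tau \right]^{1/2} \int_0^{\infty} |h(t)|\, dt < \infty \qquad \mathrm{Re}[s] \ge 0 . $$

Since $f(t) \in \mathcal{L}_2(0, T)$, this result combined with Eq. (9) implies that there exists a constant A
such that

$$ |F(s)\, H(s)| \le A \exp\{-\mathrm{Re}[s]\, T_1\} \qquad \text{for } \mathrm{Re}[s] \ge 0 . $$

From a Lemma of Titchmarsh (Ref. 68) it follows that there exist constants α and β with
$\alpha + \beta = T_1$, such that

$$ |F(s)| \le A_1 \exp\{-\mathrm{Re}[s]\, \alpha\} \qquad \mathrm{Re}[s] \ge 0 $$
$$ |H(s)| \le A_2 \exp\{-\mathrm{Re}[s]\, \beta\} \qquad \mathrm{Re}[s] \ge 0 $$

where $A_1 A_2 = A$. Finally, since

$$ f(t) = \frac{1}{2\pi j} \int_{-j\infty}^{j\infty} F(s)\, e^{st}\, ds $$

and similarly

$$ h(t) = \frac{1}{2\pi j} \int_{-j\infty}^{j\infty} H(s)\, e^{st}\, ds $$

these conditions and ordinary contour integration around a right half-plane contour imply that

$$ f(t) = 0 \quad t < \alpha $$
$$ h(t) = 0 \quad t < \beta . $$

From assumptions 3 and 4, h(t) is physically realizable and contains no pure delay. Thus, by
choosing β = 0, it follows that $\alpha = T_1$ and, if $T_1 \ge T$, that f(t) = 0 almost everywhere on [0, T].
This completes the proof that the $\{\varphi_i(t)\}$ are complete.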
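The following Lemma demonstrates that the $\{\theta_i(t)\}$ defined in the statement of the theorem
are a basis for all signals at the filter output, i.e., all signals of the form

$$ r(t) = \int_0^T x(\tau)\, h(t - \tau)\, d\tau \qquad x(t) \in \mathcal{L}_2(0, T) . $$

Lemma 1(b).

Let r(t) and $\theta_i(t)$ be as defined previously. Then

$$ r(t) = \sum_{i=1}^{\infty} r_i\, \theta_i(t) $$

where $r_i = (r, \theta_i)_{T_1} = \sqrt{\lambda_i}\, x_i$, $x_i = (x, \varphi_i)$, and uniform convergence is obtained.

Proof.

Define two functions $r_n(t)$ and $x_n(t)$ by

$$ x_n(t) \triangleq \sum_{i=1}^{n} x_i\, \varphi_i(t) \qquad 0 \le t \le T $$

where

$$ x_i = (x, \varphi_i) $$

and

$$ r_n(t) \triangleq \int_0^T x_n(\tau)\, h(t - \tau)\, d\tau \qquad 0 \le t \le T_1 . $$

Then

$$ |r(t) - r_n(t)| = \left| \int_0^T [x(\tau) - x_n(\tau)]\, h(t - \tau)\, d\tau \right| \le \int_0^T |x(\tau) - x_n(\tau)|\, |h(t - \tau)|\, d\tau . $$

Thus, from the Schwarz inequality,

$$ |r(t) - r_n(t)|^2 \le \int_0^T |x(\tau) - x_n(\tau)|^2\, d\tau \int_0^T h^2(t - \tau)\, d\tau \le \int_0^T |x(\tau) - x_n(\tau)|^2\, d\tau \int_0^{\infty} h^2(t)\, dt . $$

However, by assumption

$$ \int_0^{\infty} h^2(t)\, dt < \infty $$

and Lemma 1(a) proved that

$$ \lim_{n \to \infty} \int_0^T |x(t) - x_n(t)|^2\, dt = 0 . $$

Thus,

$$ \lim_{n \to \infty} |r(t) - r_n(t)| = 0 $$

and uniform convergence is obtained. Since

$$ r_n(t) = \sum_{i=1}^{n} x_i\, h\varphi_i(t) = \sum_{i=1}^{n} \sqrt{\lambda_i}\, x_i\, \theta_i(t) $$

it follows that

$$ r(t) = \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, x_i\, \theta_i(t) $$

and therefore that

$$ (r, \theta_j)_{T_1} = \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, x_i\, (\theta_i, \theta_j)_{T_1} . $$

But

$$ (\theta_i, \theta_j)_{T_1} = \frac{1}{\sqrt{\lambda_i \lambda_j}}\, (h\varphi_i, h\varphi_j)_{T_1} = \frac{1}{\sqrt{\lambda_i \lambda_j}}\, (\varphi_i, K\varphi_j) = \sqrt{\frac{\lambda_j}{\lambda_i}}\, (\varphi_i, \varphi_j) = \delta_{ij} . $$

Thus,

$$ r(t) = \sum_{i=1}^{\infty} (r, \theta_i)_{T_1}\, \theta_i(t) $$

and the Lemma is proved.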
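The Schwarz bound used in this proof can be observed in a discretized sketch. The impulse response, the test input, and the grid are assumptions for illustration; the bound itself is the discrete counterpart of the inequality above.

```python
import numpy as np

T, T1, dt = 1.0, 3.0, 0.01
t_in = (np.arange(int(round(T / dt))) + 0.5) * dt
t_out = (np.arange(int(round(T1 / dt))) + 0.5) * dt
lag = t_out[:, None] - t_in[None, :]
# Assumed realizable filter h(t) = exp(-5t), t >= 0.
h_vals = np.where(lag >= 0.0, np.exp(-5.0 * np.clip(lag, 0.0, None)), 0.0)
A = h_vals * dt

lam, v = np.linalg.eigh(A.T @ A)
phi = v[:, ::-1].T / np.sqrt(dt)                 # eigenfunctions, largest eigenvalue first

x = np.where(t_in < 0.5, 1.0, -1.0)              # an arbitrary finite-energy input
x_coeffs = phi @ x * dt
r = A @ x

def sup_error_and_bound(n):
    x_n = x_coeffs[:n] @ phi[:n]                 # partial sum x_n(t)
    r_n = A @ x_n                                # r_n(t) = h x_n(t)
    sup_err = np.max(np.abs(r - r_n))
    # Schwarz bound: |r - r_n|^2 <= ||x - x_n||^2 * max over t of integral h^2(t - tau) dtau
    h_energy = np.max(np.sum(h_vals ** 2, axis=1) * dt)
    bound = np.sqrt(np.sum((x - x_n) ** 2) * dt * h_energy)
    return sup_err, bound

results = [sup_error_and_bound(n) for n in (1, 5, 20, 100)]
```

Because the bound does not depend on t, driving the mean-square error of $x_n$ to zero forces the maximum error of $r_n$ to zero, which is the uniform convergence claimed in the Lemma.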

The following Lemma presents the pertinent facts relating to the representation of n(t) by
the functions $\{\theta_i(t)\}$.

Lemma 1(c).

Let the $\{\theta_i(t)\}$ be as defined above. Then the additive Gaussian noise n(t) can be written as

$$ n(t) = \sum_{i=1}^{\infty} n_i\, \theta_i(t) + n_r(t) $$

where

$$ n_i = (n, \theta_i)_{T_1} \qquad \overline{n_i} = 0 \qquad \overline{n_i n_j} = \delta_{ij} $$

and the random processes $\sum_{i=1}^{\infty} n_i \theta_i(t)$ and $n_r(t)$ are statistically independent.

Proof.†

Defining

$$ n_r(t) = n(t) - \sum_{i=1}^{\infty} n_i\, \theta_i(t) $$

it follows that n(t) can be represented in the form indicated. By assumption, n(t) is a zero-mean
process. Therefore,

$$ \overline{n_i} = \overline{(n, \theta_i)_{T_1}} = (\overline{n}, \theta_i)_{T_1} = 0 $$

and

$$ \overline{n_i n_j} = \overline{(n, \theta_i)_{T_1}\, (n, \theta_j)_{T_1}} = \int_0^{T_1} \!\! \int_0^{T_1} \overline{n(\tau)\, n(t)}\, \theta_i(\tau)\, \theta_j(t)\, d\tau\, dt $$

$$ = \int_0^{T_1} \!\! \int_0^{T_1} \delta(\tau - t)\, \theta_i(\tau)\, \theta_j(t)\, d\tau\, dt = (\theta_i, \theta_j)_{T_1} = \delta_{ij} . $$

(Note that unit variance is obtained here due to the normalization assumed in Fig. 1.) Next, let
$n_s(t)$ be defined as

$$ n_s(t) = \sum_{i=1}^{\infty} n_i\, \theta_i(t) \qquad 0 \le t \le T_1 $$

and define $\mathbf{n}_s$ and $\mathbf{n}_r$ by

$$ \mathbf{n}_s = [n_s(t_1), n_s(t_2), \ldots, n_s(t_n)] $$
$$ \mathbf{n}_r = [n_r(t'_1), n_r(t'_2), \ldots, n_r(t'_m)] . $$

Then, the random processes $n_s(t)$ and $n_r(t)$ will be statistically independent if and only if the
joint density function for $\mathbf{n}_s$ and $\mathbf{n}_r$ factors into a product of the individual density functions,
that is, if

$$ p(\mathbf{n}_s, \mathbf{n}_r) = p_1(\mathbf{n}_s)\, p_2(\mathbf{n}_r) \qquad \text{for all } \{t_i\} \text{ and } \{t'_j\} . $$

† The following discussion might more aptly be called a plausibility argument than a proof since the series
$\sum_{i=1}^{\infty} n_i \theta_i(t)$ does not converge and since n(t) is infinite bandwidth white noise for which time samples do not exist.
However, this argument is of interest for several reasons: (a) it leads to a heuristically satisfying result, (b) the
same result has been obtained by Price in a more rigorous but considerably more involved derivation, and (c) it
can be applied without apology to the colored noise problems considered later.
However, since all processes are Gaussian it can be readily shown that this factoring occurs
when all terms in the joint covariance matrix of the form $\overline{n_s(t_i)\, n_r(t'_j)}$ vanish. From the previous
definitions

$$ \overline{n_s(t)\, n_r(t')} = \overline{\sum_i n_i\, \theta_i(t) \left[ n(t') - \sum_j n_j\, \theta_j(t') \right]} $$

$$ = \sum_i \theta_i(t) \int_0^{T_1} \overline{n(\tau)\, n(t')}\, \theta_i(\tau)\, d\tau - \sum_i \sum_j \overline{n_i n_j}\, \theta_i(t)\, \theta_j(t') $$

$$ = \sum_i \theta_i(t')\, \theta_i(t) - \sum_i \theta_i(t')\, \theta_i(t) = 0 . $$

Combining the results of these three Lemmas, it follows that for any $x(t) \in \mathcal{L}_2(0, T)$

$$ x(t) = \sum_i x_i\, \varphi_i(t) $$

and

$$ y(t) = \sum_i y_i\, \theta_i(t) + n_r(t) $$

where

$$ x_i = (x, \varphi_i) $$

and

$$ y_i = (y, \theta_i)_{T_1} = \sqrt{\lambda_i}\, x_i + n_i $$
$$ n_i = (n, \theta_i)_{T_1} . $$

Thus, only the presence of the noise "remainder term" $n_r(t)$ prevents the direct use of the
vector equation

$$ \mathbf{y} = [H]\, \mathbf{x} + \mathbf{n} . $$

It will be found in all of the succeeding analyses, however, that the statistical independence of
the $n_s(t)$ and $n_r(t)$ processes would cause $n_r(t)$ to have no effect on probability of error. Thus,
for the present work, $n_r(t)$ can be deleted from the received signal space without loss of generality.
This leads to the desired vector space representation presented in the statement of the
theorem. Q.E.D.
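The resulting parallel-channel picture can be illustrated with a small Monte Carlo experiment. This is only a sketch: the impulse response, grid, trial count, random seed, and the use of independent samples of variance 1/dt as a discrete stand-in for unit-density white noise are all assumptions, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, T1, dt = 1.0, 3.0, 0.02
t_in = (np.arange(int(round(T / dt))) + 0.5) * dt
t_out = (np.arange(int(round(T1 / dt))) + 0.5) * dt
lag = t_out[:, None] - t_in[None, :]
# Assumed realizable filter h(t) = exp(-5t), t >= 0.
A = np.where(lag >= 0.0, np.exp(-5.0 * np.clip(lag, 0.0, None)), 0.0) * dt

lam, v = np.linalg.eigh(A.T @ A)
lam, v = lam[::-1][:4], v[:, ::-1][:, :4]        # four largest eigenvalues
phi = v.T / np.sqrt(dt)
theta = (A @ phi.T).T / np.sqrt(lam)[:, None]

# Discrete stand-in for unit-density white noise: independent samples of variance 1/dt.
trials = 20000
noise = rng.standard_normal((trials, t_out.size)) / np.sqrt(dt)
n_coeffs = noise @ theta.T * dt                  # n_i = (n, theta_i)_{T1}
noise_cov = n_coeffs.T @ n_coeffs / trials

# Received components for one transmitted vector: y_i = sqrt(lambda_i) x_i + n_i.
x_vec = np.array([1.0, -1.0, 0.5, 2.0])
y = (A @ (x_vec @ phi))[None, :] + noise
y_coeffs = y @ theta.T * dt
mean_y = y_coeffs.mean(axis=0)
```

With these assumptions the sample covariance of the noise coefficients is close to the identity matrix and the mean received vector is close to the transmitted coefficients scaled by the square roots of the eigenvalues.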

2.  Basis  Functions  for  Colored  Noise  and  Infinite  Observation  Interval

This section specifies basis functions for colored noise and a doubly infinite observation
interval. Since many portions of the proof of the following theorem are similar to the proof of
Theorem 1, reference will be made to the previous work where possible. For physical, as well
as mathematical, reasons the analysis of this section assumes that the following condition is
satisfied:

$$ \int_{-\infty}^{\infty} \frac{|H(f)|^2}{N(f)}\, df < \infty . $$

Theorem 2.

Let the noise of Fig. 1 have a spectral density N(f), define two symmetric functions K(t - s)
and $K_1(t - s)$ by†

$$ K(t - s) \triangleq \begin{cases} \displaystyle \int_{-\infty}^{\infty} \frac{|H(f)|^2}{N(f)} \exp[j\omega(t - s)]\, df & 0 \le t, s \le T \\ 0 & \text{elsewhere} \end{cases} $$

and

$$ K_1(t - s) \triangleq \int_{-\infty}^{\infty} N(f)^{-1} \exp[j\omega(t - s)]\, df $$

and define a set of eigenfunctions and eigenvalues by

$$ \lambda_i \varphi_i(t) = K\varphi_i(t) \qquad 0 \le t \le T, \quad i = 1, 2, \ldots \tag{10} $$

Then the vector representation of an arbitrary $x(t) \in \mathcal{L}_2(0, T)$ is $\mathbf{x}$, where $x_i = (x, \varphi_i)$, and the
vector representation of the corresponding y(t) on the doubly infinite interval $[-\infty, \infty]$ is given
by Eq. (7) in which [H] and $\mathbf{n}$ have the same properties as in Theorem 1. The basis functions
for y(t) are

$$ \theta_i(t) = \begin{cases} \dfrac{1}{\sqrt{\lambda_i}}\, h\varphi_i(t) & 0 \le t < \infty \\ 0 & t < 0 \end{cases} $$

and the i-th component of $\mathbf{y}$ is

$$ y_i = \langle y, K_1\theta_i \rangle \triangleq \int_{-\infty}^{\infty} y(t)\, K_1\theta_i(t)\, dt . $$

Proof.

Under the conditions assumed for this theorem, the functions $\{\varphi_i(t)\}$ of Eq. (10) are simply
a special case of Eq. (8) with $T_1 = +\infty$. Thus, they form a basis for $\mathcal{L}_2(0, T)$ and the representation
for x(t) follows directly. [See Lemma 1(a) for a proof of this result and a discussion of the
convergence obtained.] By defining $\theta_i(t)$ as it was previously defined, it follows directly from
Lemma 1(b) that the filter output r(t) is given by

$$ r(t) = \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, x_i\, \theta_i(t) \qquad 0 \le t < \infty . $$

† Because N(f) is an arbitrary spectral density, it is possible that $N(f)^{-1}$ will be unbounded for large f and
therefore that the integral defining $K_1(t - s)$ will not exist in a strict sense. It will be observed, however, that
$K_1(t - s)$ is always used under an integral sign in such a manner that the over-all operation is physically
meaningful. It should be noted that the detection of a known signal in the presence of colored Gaussian noise involves
an identical operation.
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r(t)  =  )_j  x.e.(t)  o,$t< 

i=  1 


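Furthermore, since

$$\begin{aligned}
\langle \theta_i, K_1\theta_j \rangle &= \frac{1}{\sqrt{\lambda_i\lambda_j}}\, \langle h\varphi_i, K_1 h\varphi_j \rangle \\
&= \frac{1}{\sqrt{\lambda_i\lambda_j}} \int_{-\infty}^{\infty} \left[ \int_0^T \varphi_i(\sigma)\, h(t-\sigma)\, d\sigma \right] \int_{-\infty}^{\infty} K_1(t-s) \int_0^T \varphi_j(\rho)\, h(s-\rho)\, d\rho\, ds\, dt \\
&= \frac{1}{\sqrt{\lambda_i\lambda_j}} \int_0^T \varphi_i(\sigma) \int_0^T \varphi_j(\rho) \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h(t-\sigma)\, K_1(t-s)\, h(s-\rho)\, dt\, ds\, d\sigma\, d\rho \\
&= \frac{1}{\sqrt{\lambda_i\lambda_j}} \int_0^T \varphi_i(\sigma) \int_0^T \varphi_j(\rho) \int_{-\infty}^{\infty} \frac{|H(f)|^2}{N(f)} \exp[j\omega(\sigma-\rho)]\, df\, d\sigma\, d\rho \\
&= \frac{1}{\sqrt{\lambda_i\lambda_j}}\, (\varphi_i, K\varphi_j)_T = \sqrt{\frac{\lambda_j}{\lambda_i}}\, (\varphi_i, \varphi_j)_T = \delta_{ij}
\end{aligned}$$

it follows that

$$\langle r, K_1\theta_i \rangle = \sqrt{\lambda_i}\, x_i$$

and thus that

$$r(t) = \sum_{i=1}^{\infty} \langle r, K_1\theta_i \rangle\, \theta_i(t) .$$

Finally, with $\mathcal{R}_n(\tau)$ and $n_i$ defined by

$$\mathcal{R}_n(\tau) \triangleq \int_{-\infty}^{\infty} N(f)\, e^{j\omega\tau}\, df$$

$$n_i \triangleq \langle n, K_1\theta_i \rangle$$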


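it follows that

$$\overline{n_i} = \overline{\langle n, K_1\theta_i \rangle} = 0$$

and

$$\begin{aligned}
\overline{n_i n_j} &= \overline{\langle n, K_1\theta_i \rangle \langle n, K_1\theta_j \rangle} \\
&= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \mathcal{R}_n(t-s) \int_{-\infty}^{\infty} K_1(t-\sigma)\, \theta_i(\sigma)\, d\sigma \int_{-\infty}^{\infty} K_1(s-\rho)\, \theta_j(\rho)\, d\rho\, dt\, ds \\
&= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \theta_i(\sigma)\, \theta_j(\rho)\, d\sigma\, d\rho \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \mathcal{R}_n(t-s)\, K_1(t-\sigma)\, K_1(s-\rho)\, ds\, dt
\end{aligned}$$

or

$$\overline{n_i n_j} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \theta_i(\sigma)\, \theta_j(\rho)\, K_1(\sigma-\rho)\, d\sigma\, d\rho = \langle \theta_i, K_1\theta_j \rangle = \delta_{ij} .$$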

From this it is readily shown, following a procedure identical to that used in Lemma 1(c), that the random processes $n_s(t)$ and $n_r(t)$ defined by

$$n_s(t) \triangleq \sum_{i=1}^{\infty} n_i\, \theta_i(t)$$

$$n_r(t) \triangleq n(t) - n_s(t)$$

are statistically independent. Thus, neglecting the "remainder term" leads to

$$y(t) = \sum_{i=1}^{\infty} y_i\, \theta_i(t)$$

where

$$y_i = \langle y, K_1\theta_i \rangle = \sqrt{\lambda_i}\, x_i + n_i$$

or, in vector notation,

$$y = [H]\, x + n .$$


3.  Basis  Functions  for  Colored  Noise  and  Finite  Observation  Interval 

This  section  specifies  basis  functions  for  colored  noise  and  an  arbitrary,  finite  observation 
interval.  As  in  Sec.  II-D-2,  it  is  assumed  that 


$$\int_{-\infty}^{\infty} \frac{|H(f)|^2}{N(f)}\, df < \infty .$$


Theorem  3. 

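Let the noise of Fig. 1 have a spectral density N(f), define a function† $K_1(t,s)$ by

$$\int_0^{T_1} \mathcal{R}_n(t-\sigma)\, K_1(\sigma,s)\, d\sigma = \delta(t-s), \qquad 0 \le t, s \le T_1$$

where

$$\mathcal{R}_n(\tau) \triangleq \int_{-\infty}^{\infty} N(f)\, e^{j\omega\tau}\, df ,$$

define a function K(t,s) by

$$K(t,s) \triangleq \begin{cases} \displaystyle\int_0^{T_1}\!\int_0^{T_1} h(\sigma-t)\, K_1(\sigma,\rho)\, h(\rho-s)\, d\sigma\, d\rho & 0 \le t, s \le T \\ 0 & \text{elsewhere} \end{cases}$$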


† Comments identical to those in the footnote to Theorem 2 also apply to this "function."

and define a set of eigenfunctions and eigenvalues by

$$\lambda_i\, \varphi_i(t) = K\varphi_i(t), \qquad 0 \le t \le T, \quad i = 1, 2, \ldots \qquad (11)$$

Then the vector representation of an arbitrary $x(t) \in \mathcal{L}_2(0,T)$ is x, where $x_i = (x, \varphi_i)$, and the vector representation of the corresponding y(t) on the interval $[0, T_1]$ is given by Eq. (7) in which [H] and n have the same properties as in Theorem 1. The basis functions for y(t) are

$$\theta_i(t) = \begin{cases} \dfrac{1}{\sqrt{\lambda_i}}\, h\varphi_i(t) & 0 \le t \le T_1 \\ 0 & \text{elsewhere} \end{cases}$$

and the $i$th component of $y$ is

$$y_i = (y, K_1\theta_i)_{T_1} .$$


Proof. 

This  proof  consists  of  a  series  of  Lemmas.  The  first  Lemma  presents  an  interesting  prop¬ 
erty  of  a  complete  set  of  orthonormal  functions. 

Lemma  3(a). 

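Let $\{\gamma_i(t)\}$ be an arbitrary set of orthonormal functions that form a basis for $\mathcal{L}_2(0,T_1)$, i.e., they are complete in $\mathcal{L}_2(0,T_1)$. Then

$$\sum_{i=1}^{\infty} \gamma_i(t)\, \gamma_i(s) = \delta(t-s), \qquad 0 \le t, s \le T_1$$

in the sense that for any $f(t) \in \mathcal{L}_2(0,T_1)$

$$\underset{n\to\infty}{\mathrm{l.i.m.}} \int_0^{T_1} f(\sigma) \left[ \sum_{i=1}^{n} \gamma_i(t)\, \gamma_i(\sigma) \right] d\sigma = f(t) .$$

Proof.

By definition, a set of orthonormal functions $\{\gamma_i(t)\}$ that are complete in $\mathcal{L}_2(0,T_1)$ have the property that for any $f(t) \in \mathcal{L}_2(0,T_1)$

$$f(t) = \underset{n\to\infty}{\mathrm{l.i.m.}} \sum_{i=1}^{n} (f, \gamma_i)_{T_1}\, \gamma_i(t), \qquad 0 \le t \le T_1$$

that is,

$$f(t) = \underset{n\to\infty}{\mathrm{l.i.m.}} \int_0^{T_1} f(\sigma) \left[ \sum_{i=1}^{n} \gamma_i(\sigma)\, \gamma_i(t) \right] d\sigma .$$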


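The following Lemma provides a constructive proof that the function $K_1(t,s)$ defined in the theorem exists.

Lemma 3(b).

Define a function $K_n(t-s)$ by

$$K_n(t-s) \triangleq \begin{cases} \mathcal{R}_n(t-s) & 0 \le t, s \le T_1 \\ 0 & \text{elsewhere} \end{cases}$$

and define a set of eigenfunctions $\{\gamma_i(t)\}$ and eigenvalues $\{\beta_i\}$ by

$$\beta_i\, \gamma_i(t) = K_n\gamma_i(t), \qquad 0 \le t \le T_1, \quad i = 1, 2, \ldots$$

Then

$$K_1(t,s) = \sum_{i=1}^{\infty} \frac{\gamma_i(t)\, \gamma_i(s)}{\beta_i}, \qquad 0 \le t, s \le T_1 .$$

Proof.

From Mercer's theorem (Ref. 64) it is known that

$$\mathcal{R}_n(t-s) = \sum_{i=1}^{\infty} \beta_i\, \gamma_i(t)\, \gamma_i(s), \qquad 0 \le t, s \le T_1 .$$

Thus, with

$$K_1(t,s) = \sum_{i=1}^{\infty} \frac{\gamma_i(t)\, \gamma_i(s)}{\beta_i}, \qquad 0 \le t, s \le T_1$$

it follows that

$$\int_0^{T_1} \mathcal{R}_n(t-\sigma)\, K_1(\sigma,s)\, d\sigma = \sum_{i}\sum_{j} \frac{\beta_i}{\beta_j}\, \gamma_i(t)\, \gamma_j(s)\, (\gamma_i, \gamma_j)_{T_1} = \sum_{i} \gamma_i(t)\, \gamma_i(s) .$$

This result, together with Lemma 3(a) and the known fact that the $\{\gamma_i(t)\}$ are complete, finishes the proof.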
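
The construction in Lemma 3(b) has a direct finite-dimensional analog that can be checked numerically. The sketch below is only illustrative and rests on assumptions not made in the text: the noise autocorrelation is sampled on a uniform grid, the integral is replaced by a plain sum (quadrature weights are ignored), and the exponential autocorrelation, the grid size, and the helper name `inverse_kernel` are hypothetical choices.

```python
import numpy as np

def inverse_kernel(R):
    """Return K1 = sum_i gamma_i gamma_i^T / beta_i, the matrix analog of Lemma 3(b)."""
    beta, gamma = np.linalg.eigh(R)      # R = sum_i beta_i gamma_i gamma_i^T (Mercer analog)
    return (gamma / beta) @ gamma.T      # divide column i by beta_i, then recombine

# Hypothetical sampled autocorrelation: exp(-|t - s|) on 50 points of [0, 1]
n = 50
t = np.linspace(0.0, 1.0, n)
R = np.exp(-np.abs(t[:, None] - t[None, :]))

K1 = inverse_kernel(R)
# The identity matrix plays the role of delta(t - s) in this discrete setting.
print(np.max(np.abs(R @ K1 - np.eye(n))))
```

In this discrete setting the completeness required by Lemma 3(a) is automatic, since the eigenvectors of a symmetric matrix span the whole space.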

Lemma  3(c). 


If $T_1 \ge T$, the $\{\varphi_i(t)\}$ defined by Eq. (11) form an orthonormal basis for $\mathcal{L}_2(0,T)$, that is, for every $x(t) \in \mathcal{L}_2(0,T)$

$$x(t) = \sum_{i=1}^{\infty} x_i\, \varphi_i(t), \qquad 0 \le t \le T$$

where $x_i = (x, \varphi_i)$ and convergence is mean square.

Proof.

Since K(t,s) is an $\mathcal{L}_2$-kernel,† it follows from the proof of Lemma 1(a) that the $\{\varphi_i(t)\}$ are complete in $\mathcal{L}_2(0,T)$ if and only if the condition

$$(f, Kf)_T = 0$$

implies f(t) = 0 almost everywhere on [0,T]. But

$$(f, Kf)_T = \int_0^T\!\int_0^T\!\int_0^{T_1}\!\int_0^{T_1} f(t)\, h(\sigma-t)\, K_1(\sigma,\rho)\, h(\rho-s)\, f(s)\, dt\, ds\, d\sigma\, d\rho .$$

Thus, from Lemma 3(b),

$$(f, Kf)_T = \sum_{i=1}^{\infty} \frac{1}{\beta_i} \left[ \int_0^{T_1}\!\int_0^T f(t)\, h(\sigma-t)\, \gamma_i(\sigma)\, dt\, d\sigma \right]^2$$

and it follows that the condition

$$(f, Kf)_T = 0$$

implies

$$\int_0^{T_1} \left[ \int_0^T f(t)\, h(\sigma-t)\, dt \right] \gamma_i(\sigma)\, d\sigma = 0, \qquad i = 1, 2, \ldots$$

However, the completeness of the $\gamma_i(t)$ implies (Ref. 65) that the only function orthogonal to all the $\gamma_i(t)$ is the function that is zero almost everywhere. Thus $(f, Kf)_T = 0$ implies

$$\int_0^T f(t)\, h(\sigma-t)\, dt = 0 \quad \text{almost everywhere on } [0, T_1]$$

and, from the proof of Lemma 1(a), it follows that, for $T_1 \ge T$, f(t) = 0 almost everywhere on [0,T].

This finishes the proof that the $\{\varphi_i(t)\}$ are complete. The following paragraph outlines a proof of the remainder of the theorem.

With $\theta_i(t)$ as previously defined, it follows directly from Lemmas 1(b) and 3(c) that for any $x(t) \in \mathcal{L}_2(0,T)$ the filter output r(t) is given by

$$r(t) = \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, x_i\, \theta_i(t), \qquad 0 \le t \le T_1 .$$

Furthermore, by a procedure identical to that in Theorem 2, it follows that

$$(\theta_i, K_1\theta_j)_{T_1} = \frac{1}{\sqrt{\lambda_i\lambda_j}}\, (\varphi_i, K\varphi_j)_T = \delta_{ij}$$

which implies that

$$(r, K_1\theta_i)_{T_1} = \sqrt{\lambda_i}\, x_i$$


t  See  Appendix  A. 


and thus that

$$r(t) = \sum_{i=1}^{\infty} (r, K_1\theta_i)_{T_1}\, \theta_i(t) .$$

Next, with $n_i$ defined by

$$n_i \triangleq (n, K_1\theta_i)_{T_1}$$

it follows that

$$\overline{n_i} = \overline{(n, K_1\theta_i)_{T_1}} = 0$$

and, by a procedure identical to that in Theorem 2, that

$$\overline{n_i n_j} = \overline{(n, K_1\theta_i)_{T_1}\, (n, K_1\theta_j)_{T_1}} = (\theta_i, K_1\theta_j)_{T_1} = \delta_{ij} .$$

From this it is readily shown, following a procedure identical to that in Lemma 1(c), that the random processes $n_s(t)$ and $n_r(t)$ defined by

$$n_s(t) \triangleq \sum_{i=1}^{\infty} n_i\, \theta_i(t), \qquad 0 \le t \le T_1$$

$$n_r(t) \triangleq n(t) - n_s(t), \qquad 0 \le t \le T_1$$

are statistically independent. Thus, neglect of the "remainder term" leads to

$$y(t) = \sum_{i=1}^{\infty} y_i\, \theta_i(t)$$

where

$$y_i = (y, K_1\theta_i)_{T_1} = \sqrt{\lambda_i}\, x_i + n_i$$

or, in vector notation,

$$y = [H]\, x + n$$

and the theorem is proved.
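
A finite-dimensional analog of Theorem 3 may help make the roles of K, $K_1$, and the two bases concrete. The sketch below is an illustration under assumptions that are not part of the text: signals are sampled vectors, the channel filter is a convolution matrix `H` built from a hypothetical decaying impulse response, the noise covariance `R` is a hypothetical first-order model, and the inverse kernel $K_1$ is taken to be the matrix inverse of `R`.

```python
import numpy as np

def channel_basis(H, R):
    """Discrete analog of Theorem 3: eigen-decompose K = H^T K1 H with K1 = R^{-1}."""
    K1 = np.linalg.inv(R)                    # plays the role of K_1(t, s)
    K = H.T @ K1 @ H                         # plays the role of K(t, s)
    lam, Phi = np.linalg.eigh(K)             # eigenvalues lambda_i, input basis phi_i
    order = np.argsort(lam)[::-1]            # sort so that lambda_1 >= lambda_2 >= ...
    lam, Phi = lam[order], Phi[:, order]
    Theta = (H @ Phi) / np.sqrt(lam)         # output basis theta_i = h phi_i / sqrt(lambda_i)
    return lam, Phi, Theta, K1

# Hypothetical channel: 8 input samples, 5-tap decaying impulse response
d, L = 8, 5
h = np.exp(-np.arange(L) / 2.0)
m = d + L - 1
H = np.zeros((m, d))
for j in range(d):
    H[j:j + L, j] = h

# Hypothetical colored-noise covariance 0.7^|i - j|
idx = np.arange(m)
R = 0.7 ** np.abs(idx[:, None] - idx[None, :])

lam, Phi, Theta, K1 = channel_basis(H, R)

x = np.arange(1.0, d + 1.0)                  # arbitrary input signal
x_coef = Phi.T @ x                           # x_i = (x, phi_i)
r = H @ x                                    # filter output
r_coef = Theta.T @ K1 @ r                    # (r, K1 theta_i), expected sqrt(lambda_i) x_i
noise_cov = Theta.T @ K1 @ R @ K1 @ Theta    # covariance of n_i = (n, K1 theta_i)
```

With these definitions `Theta.T @ K1 @ Theta` and `noise_cov` both come out as the identity matrix, mirroring $(\theta_i, K_1\theta_j)_{T_1} = \delta_{ij}$ and $\overline{n_i n_j} = \delta_{ij}$.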


4.  Interpretation  of  Signal  Representations 

The  previous  sections  have  presented  several  results  concerning  signal  representation  for 
the  channel  of  Fig.  1.  Since  these  results  have  of  necessity  been  presented  in  a  highly  mathemat¬ 
ical  context,  it  is  of  interest  to  interpret  these  in  terms  of  more  physical  engineering  concepts; 
in  particular,  it  is  desirable  to  interpret  these  in  terms  of  optimum  detectors  for  digital 
communication. 

Consider first the $\{\varphi_i(t)\}$ of Theorem 1. The first and foremost property of these functions is that they are solutions of the integral equation

$$\lambda_i\, \varphi_i(t) = \int_0^T \varphi_i(\tau)\, K(t,\tau)\, d\tau, \qquad 0 \le t \le T \qquad (12)$$

where

$$K(t,s) = \int_0^{T_1} h(\sigma-t)\, h(\sigma-s)\, d\sigma .$$

To understand the physical meaning of this relation, consider the optimum detector for white noise and an observation interval $[0, T_1]$ when a time-limited signal x(t) is transmitted. If r(t) is the corresponding channel filter output on $[0, T_1]$, it is well known that the optimum detector cross correlates r(t) with the channel output over the interval $[0, T_1]$. Interpreted in terms of linear time-invariant filters,† this operation can be performed as shown in Fig. 2 where the matched filter has been realized as the cascade of a filter matched to the channel filter over the interval $[0, T_1]$ followed by a multiply-and-integrate operation. The significance of the fact that the $\{\varphi_i(t)\}$ are a solution to Eq. (12) is now made clear by noting that the (time-variant) impulse response of the cascade of the channel filter and the "channel portion" of the matched filter is

$$K(t,s) = \int_0^{T_1} h(\sigma-t)\, h(\sigma-s)\, d\sigma, \qquad 0 \le t, s \le T \qquad (13)$$

where K(t,s) is the response at time t to an impulse applied at time s. Thus, the $\{\varphi_i(t)\}$ are simply a set of signals that are self-reproducing (to within a gain factor $\lambda_i$) over the interval [0,T] when passed through the filter K(t,s).

Fig. 2. Concerning interpretation of the $\{\varphi_i(t)\}$ of Theorem 1.

This feature or, more basically, the fact that the $\{\varphi_i(t)\}$ are a solution to Eq. (12), causes the $\{\varphi_i(t)\}$ to have two extremely important properties. The first property, that the $\{\varphi_i(t)\}$ are orthogonal over the interval [0,T] and may be assumed normalized, is readily shown in the following manner. Assume that for $i \ne j$, $\lambda_i \ne \lambda_j$.‡ Then it follows that

$$\int_0^T \varphi_i(t)\, \lambda_j\, \varphi_j(t)\, dt = \int_0^T \varphi_i(t) \int_0^T \varphi_j(\tau)\, K(t,\tau)\, d\tau\, dt$$

or

$$\lambda_j\, (\varphi_i, \varphi_j) = (\varphi_i, K\varphi_j) \qquad (14)$$

and similarly, since K(t,s) = K(s,t),

$$\int_0^T \varphi_j(t)\, \lambda_i\, \varphi_i(t)\, dt = \int_0^T \varphi_j(t) \int_0^T \varphi_i(\tau)\, K(t,\tau)\, d\tau\, dt = \int_0^T \varphi_i(\tau) \int_0^T \varphi_j(t)\, K(\tau,t)\, dt\, d\tau$$

or

$$\lambda_i\, (\varphi_i, \varphi_j) = (\varphi_i, K\varphi_j) . \qquad (15)$$

Therefore, upon subtracting Eq. (14) from Eq. (15) it follows that

$$(\lambda_i - \lambda_j)\, (\varphi_i, \varphi_j) = 0 .$$

But, by assumption, $\lambda_i - \lambda_j \ne 0$. Thus $(\varphi_i, \varphi_j) = 0$ and the $\{\varphi_i(t)\}$ are orthogonal. The fact that the $\{\varphi_i(t)\}$ can be assumed to be normalized follows directly from Eq. (12) by observing that if $\varphi_i(t)$ is a solution to this equation, then $c\,\varphi_i(t)$ is also a solution when c is an arbitrary constant.

† Here, and throughout this work, the question of the physical realizability of all filters other than the channel filter has been ignored.

‡ It can be shown via an argument too long to present here that all $\varphi_i(t)$ having a common eigenvalue can be assumed to be orthogonal (Refs. 64, 65).

The second property, that the $\{\varphi_i(t)\}$ are orthogonal over the interval $[0, T_1]$ after passing through the filter, follows in a straightforward manner. Let $r_i(t)$ be the channel filter output when $\varphi_i(t)$ is transmitted. Then

$$\begin{aligned}
\int_0^{T_1} r_i(t)\, r_j(t)\, dt &= \int_0^{T_1} \int_0^T \varphi_i(\sigma)\, h(t-\sigma)\, d\sigma \int_0^T \varphi_j(\rho)\, h(t-\rho)\, d\rho\, dt \\
&= \int_0^T\!\int_0^T \varphi_i(\sigma)\, \varphi_j(\rho) \int_0^{T_1} h(t-\sigma)\, h(t-\rho)\, dt\, d\rho\, d\sigma \\
&= \int_0^T \varphi_i(\sigma) \left[ \int_0^T \varphi_j(\rho)\, K(\sigma,\rho)\, d\rho \right] d\sigma \\
&= \lambda_j \int_0^T \varphi_i(\sigma)\, \varphi_j(\sigma)\, d\sigma = \lambda_i\, \delta_{ij} . \qquad (16)
\end{aligned}$$

Thus, the $\{\varphi_i(t)\}$ are orthogonal after passing through the filter. This property is important because it allows the channel memory (its time-dispersive characteristic) to be treated analytically in terms of a number of parallel and independent time-discrete channels with different gains. In other words, when the transmitted signal is written as

$$x(t) = \sum_i x_i\, \varphi_i(t)$$

it follows from Theorem 1 that the received signal can be written as

$$y(t) = \sum_i y_i\, \theta_i(t)$$

where

$$\theta_i(t) \triangleq \frac{1}{\sqrt{\lambda_i}} \int_0^T \varphi_i(\tau)\, h(t-\tau)\, d\tau$$

and, due to the fact that the $\theta_i(t)$ are orthonormal as just demonstrated [$\sqrt{\lambda_i}\, \theta_i(t) = r_i(t)$],

$$y_i = (y, \theta_i)_{T_1} = \sqrt{\lambda_i}\, x_i + n_i \qquad (17)$$

with

$$\overline{n_i n_j} = \delta_{ij} .$$

This is simply a statement that to obtain y each coordinate of x is passed through an independent time-discrete additive Gaussian channel with gain $\sqrt{\lambda_i}$ and unity noise variance. Thus, for purposes of analysis, the channel of Fig. 1 simplifies to that of Fig. 3.
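
For sampled signals and white noise, the decomposition into parallel channels corresponds to a singular value decomposition of the channel matrix: the right singular vectors act as the $\varphi_i$, the left singular vectors as the $\theta_i$, and the singular values as the gains $\sqrt{\lambda_i}$. The sketch below illustrates this under assumptions not made in the text: the convolution matrix `H`, its hypothetical impulse response, and the random test vectors are all illustrative choices.

```python
import numpy as np

# Hypothetical channel: 8 input samples, 5-tap decaying impulse response
d, L = 8, 5
h = np.exp(-np.arange(L) / 2.0)
H = np.zeros((d + L - 1, d))
for j in range(d):
    H[j:j + L, j] = h

# H = U diag(s) V^T, so H^T H has eigenvectors V and eigenvalues s**2
U, s, Vt = np.linalg.svd(H, full_matrices=False)
Phi, Theta, lam = Vt.T, U, s ** 2

rng = np.random.default_rng(1)
x = rng.standard_normal(d)                   # arbitrary transmitted signal
n = rng.standard_normal(d + L - 1)           # one white-noise realization
y = H @ x + n                                # received signal

x_coef = Phi.T @ x                           # x_i = (x, phi_i)
n_coef = Theta.T @ n                         # n_i = (n, theta_i)
y_coef = Theta.T @ y                         # y_i = (y, theta_i)
outputs = H @ Phi                            # r_i(t) = h phi_i(t), one column per i
```

Each `y_coef[i]` equals `sqrt(lam[i]) * x_coef[i] + n_coef[i]`, and `outputs.T @ outputs` is diagonal with entries `lam`, which is the sampled counterpart of Eqs. (16) and (17).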


Fig. 3. Mathematical equivalent of channel in Fig. 1.


Finally, it is of interest to interpret the inner product for $y_i$ in terms of Fig. 2. From Eq. (17), $y_i$ is given by

$$\begin{aligned}
y_i = (y, \theta_i)_{T_1} &= \int_0^{T_1} y(t)\, \theta_i(t)\, dt \\
&= \int_0^{T_1} y(t) \int_0^T \frac{\varphi_i(\tau)}{\sqrt{\lambda_i}}\, h(t-\tau)\, d\tau\, dt \\
&= \frac{1}{\sqrt{\lambda_i}} \int_0^T \varphi_i(\tau) \int_0^{T_1} y(t)\, h(t-\tau)\, dt\, d\tau . \qquad (18)
\end{aligned}$$

Comparing Eq. (18) with the filtering operations indicated in Fig. 2 shows that $y_i$ is precisely $1/\sqrt{\lambda_i}$ times the output of the optimum (matched-filter) detector when $\varphi_i(t)$ is transmitted. This result becomes important in practice when a large number ($M \gg d$) of d-dimensional signals are to be transmitted, since the construction of d "coordinate filters" is much simpler than the construction of M different matched filters.

The previous discussion has shown that in spite of the highly mathematical nature of the signal representation of Theorem 1, it is possible to readily understand the important properties of the $\{\varphi_i(t)\}$ by interpreting them in terms of optimum detectors. An analogous discussion of the properties of the $\{\varphi_i(t)\}$ of Theorem 2 follows.

As before, the $\{\varphi_i(t)\}$ are defined as solutions to the integral equation

$$\lambda_i\, \varphi_i(t) = \int_0^T \varphi_i(\tau)\, K(t,\tau)\, d\tau, \qquad 0 \le t \le T \qquad (19)$$

where K(t,s) is now given by

$$K(t,s) = K(t-s) = \int_{-\infty}^{\infty} \frac{|H(f)|^2}{N(f)} \exp[j\omega(t-s)]\, df .$$

To obtain a physical understanding of Eq. (19), consider the optimum detector for colored noise and a doubly infinite observation interval when a time-limited signal x(t) is transmitted. It is known that the optimum detector cross correlates the channel output y(t) with a function q(t) over the doubly infinite interval, where

$$q(t) = \int_{-\infty}^{\infty} \frac{X(f)\, H(f)}{N(f)}\, e^{j\omega t}\, df .$$

Fig. 4. Concerning interpretation of the $\{\varphi_i(t)\}$ of Theorem 2.

Interpreted in terms of linear time-invariant filters, this operation can be performed as shown in Fig. 4. Note that the optimum detector has been formed here as the cascade of a "prewhitening" filter, a filter matched to the "whitened" channel filter characteristic, and a multiply-and-integrate operation. Upon observing that the impulse response of the cascade of the channel filter and the "channel portion" of the optimum detector is given by

$$K(t) = \int_{-\infty}^{\infty} \frac{|H(f)|^2}{N(f)}\, e^{j\omega t}\, df$$

it follows from Eq. (19) that the $\{\varphi_i(t)\}$ are simply a set of signals that are self-reproducing (to within a gain factor $\lambda_i$) over the interval [0,T] when passed through the filter K(t). As before, this characteristic causes the $\{\varphi_i(t)\}$ to have two important properties. The first property, that the $\{\varphi_i(t)\}$ are orthonormal, follows directly from the discussion of the $\{\varphi_i(t)\}$ of Eq. (12). The second property, that the $\{\varphi_i(t)\}$ are orthogonal at the channel output with respect to a generalized inner product, has been demonstrated in the proof of Theorem 2. A more physical interpretation of the latter property can be obtained by investigating the orthogonality of the $\varphi_i(t)$ at the output of the "whitened" channel. Let $r_i(t)$ be the output of this channel when $\varphi_i(t)$ is transmitted; let $\theta_i(t)$ and $K_1(t)$ be defined as in Theorem 2, i.e.,

$$\theta_i(t) = \frac{1}{\sqrt{\lambda_i}} \int_0^T \varphi_i(\tau)\, h(t-\tau)\, d\tau$$

and

$$K_1(t) = \int_{-\infty}^{\infty} \frac{1}{N(f)}\, e^{j\omega t}\, df$$

and let $h_w(t)$ be the impulse response of the "prewhitening" filter. Then

$$\begin{aligned}
\int_{-\infty}^{\infty} r_i(t)\, r_j(t)\, dt &= \sqrt{\lambda_i\lambda_j} \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} \theta_i(\sigma)\, h_w(t-\sigma)\, d\sigma \right] \left[ \int_{-\infty}^{\infty} \theta_j(\rho)\, h_w(t-\rho)\, d\rho \right] dt \\
&= \sqrt{\lambda_i\lambda_j} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \theta_i(\sigma)\, \theta_j(\rho) \int_{-\infty}^{\infty} h_w(t-\sigma)\, h_w(t-\rho)\, dt\, d\sigma\, d\rho \\
&= \sqrt{\lambda_i\lambda_j} \int_{-\infty}^{\infty} \theta_i(\sigma) \int_{-\infty}^{\infty} \theta_j(\rho)\, K_1(\sigma-\rho)\, d\rho\, d\sigma \\
&= \sqrt{\lambda_i\lambda_j}\, \langle \theta_i, K_1\theta_j \rangle = \lambda_i\, \delta_{ij}
\end{aligned}$$

where the last line follows from the proof of Theorem 2. Thus, the statement that the $\varphi_i(t)$ are orthogonal at the channel output with respect to a generalized inner product is equivalent to the statement that the $\{\varphi_i(t)\}$ are orthogonal at the output of the "whitened" channel.

From these orthogonality properties it follows, as shown in Theorem 2, that if x(t) is written as

$$x(t) = \sum_i x_i\, \varphi_i(t)$$

then y(t) can be written as

$$y(t) = \sum_i y_i\, \theta_i(t)$$

where

$$y_i = \langle y, K_1\theta_i \rangle = \sqrt{\lambda_i}\, x_i + n_i \qquad (20)$$

and

$$\overline{n_i n_j} = \delta_{ij} .$$

Thus, the orthogonality properties of the $\{\varphi_i(t)\}$ again allow the channel memory (its time-dispersive characteristic and the colored noise) to be treated analytically in terms of a number of parallel and independent channels with different gains as shown in Fig. 3.

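Finally, it is of interest to interpret the inner product for $y_i$ in terms of Fig. 4. From Eq. (20), $y_i$ is given by

$$\begin{aligned}
y_i = \langle y, K_1\theta_i \rangle &= \int_{-\infty}^{\infty} y(t) \int_{-\infty}^{\infty} K_1(t-\tau)\, \theta_i(\tau)\, d\tau\, dt \\
&= \int_{-\infty}^{\infty} y(t) \int_0^T \frac{\varphi_i(\sigma)}{\sqrt{\lambda_i}} \int_{-\infty}^{\infty} h(\tau-\sigma)\, K_1(t-\tau)\, d\tau\, d\sigma\, dt \\
&= \frac{1}{\sqrt{\lambda_i}} \int_0^T \varphi_i(\sigma) \int_{-\infty}^{\infty} y(t) \int_{-\infty}^{\infty} \frac{H^*(f)}{N(f)} \exp[j\omega(\sigma-t)]\, df\, dt\, d\sigma . \qquad (21)
\end{aligned}$$

Comparing Eq. (21) with the filtering operations indicated in Fig. 4 shows that $y_i$ is precisely $1/\sqrt{\lambda_i}$ times the output of the optimum detector when $\varphi_i(t)$ is transmitted. It is interesting to note that this is identical to the result obtained previously for the $\{\varphi_i(t)\}$ of Theorem 1.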

In conclusion, it should be mentioned that the $\{\varphi_i(t)\}$ of Theorem 3 can also be interpreted in terms of the appropriate optimum detector in a manner directly analogous to the preceding discussion. In this case, however, the derivation of the (time-variant) "prewhitening" filter and the remainder of the optimum detector becomes as mathematically involved as was the proof of the Theorem.


5.  Some  Optimum  Signal  Properties  of  Basis  Functions 

The functions $\{\varphi_i(t)\}$ have some additional properties that are of interest from the standpoint of optimum signal design. Consider the situation in which one of two time-limited signals is to be transmitted through the channel of Fig. 1. Let these signals be ±x(t). It is desired to select the signal waveform x(t) so that the probability of error is minimized for fixed signal energy when an optimum detector is used. Depending upon the noise spectral density and the receiver observation interval, the following results are obtained.†


a.  White  Noise,  Arbitrary  Observation  Interval  [0,T^] 

It is well known that the optimum detector (matched filter) for this situation makes a decision based upon the quantity

$$\int_0^{T_1} y(t)\, r(t)\, dt = (y, hx)_{T_1} = y \cdot r$$

where r = [H] x, the dot denotes the usual vector inner product, and the basis functions, with respect to which x, y, and r are defined, are those of Theorem 1. The probability of error for this detector is given by

$$P_e = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-\sqrt{E/N_0}} \exp\left[-\tfrac{1}{2} t^2\right] dt$$

where

$$E = (hx, hx)_{T_1} = r \cdot r = \sum_{i=1}^{\infty} \lambda_i\, x_i^2$$

and $N_0$ is the (double-sided) noise spectral density. Thus, since the transmitted signal energy is

$$(x, x) = x \cdot x = \sum_{i=1}^{\infty} x_i^2$$

and since by convention (Ref. 64) $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots$, it follows that for fixed input signal energy the output signal energy is maximum (and therefore the probability of error is minimum) when $x(t) = \varphi_1(t)$.‡ More generally, it is quite easily shown from these results that $x(t) = \varphi_j(t)$ is the signal giving maximum output energy on the interval $[0, T_1]$ under the constraints

$$(x, \varphi_i) = 0, \qquad i = 1, 2, \ldots, j-1$$

$$(x, x) = 1 .$$

† Note that these results assume either a single transmission or negligible intersymbol interference.

‡ Note that $\lambda_i = (h\varphi_i, h\varphi_i)_{T_1} / (\varphi_i, \varphi_i)$. Thus, $\lambda_i$ is effectively an energy "transfer ratio" and $\varphi_1(t)$ is the $\mathcal{L}_2(0,T)$ signal having the largest "transfer ratio," i.e., it is "attenuated" least in passing through the filter. This property appears to have been first recognized by Chalk for the special case $T_1$ =

At this point, two additional properties of the $\{\varphi_i(t)\}$ of Theorem 1 should be mentioned:

(1) When $T_1 = \infty$, Eq. (7) reduces to the well-known^0 integral equation involved in the Karhunen-Loève expansion of a random process with autocorrelation function $\mathcal{R}_n(t-s) = K(t,s)$;

(2) When K(t,s) is defined as

$$K(t,s) \triangleq \begin{cases} \displaystyle\int_{-\infty}^{\infty} h(\sigma-t)\, h(\sigma-s)\, d\sigma & 0 \le t, s \le T \\ 0 & \text{elsewhere} \end{cases}$$

and h(t) is specialized to $h(t) = (\sin \pi t)/\pi t$, the $\{\varphi_i(t)\}$ are the prolate spheroidal wave functions studied by Slepian, Pollak, and Landau (Ref. 73).


The next section demonstrates that $\varphi_1(t)$ of Theorem 2 is the optimum signal for binary transmission when the noise is colored and a doubly infinite observation interval is used.


b. Colored Noise, Infinite Observation Interval

The optimum detector for this situation is known to make a decision based on the quantity

$$\int_{-\infty}^{\infty} y(t)\, q(t)\, dt$$

where

$$q(t) = \int_{-\infty}^{\infty} \frac{X(f)\, H(f)}{N(f)}\, e^{j\omega t}\, df .$$

Introducing the signal representation of Theorem 2, this becomes

$$\int_{-\infty}^{\infty} y(t)\, q(t)\, dt = \langle y, K_1 r \rangle = y \cdot r = y \cdot [H]\, x .$$

The probability of error for this case is determined by the quantity

$$\int_{-\infty}^{\infty} \frac{|X(f)\, H(f)|^2}{N(f)}\, df = \langle r, K_1 r \rangle = r \cdot r = \sum_{i=1}^{\infty} \lambda_i\, x_i^2$$

which might aptly be called the "generalized $E/N_0$." Thus, since the input energy is $(x, x) = x \cdot x$, it follows that again $x(t) = \varphi_1(t)$ is the optimum signal to be used to minimize probability of error for fixed input signal energy.

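In conclusion, the following section demonstrates that $\varphi_1(t)$ of Theorem 3 is the optimum signal for binary transmission when the noise is colored and a finite observation interval $[0, T_1]$ is used.


c. Colored Noise, Finite Observation Interval

For this situation, the optimum detector is known to make a decision based on the quantity

$$\int_0^{T_1} y(t)\, q(t)\, dt$$

where

$$q(t) \triangleq \sum_{i=1}^{\infty} \frac{(r, \gamma_i)_{T_1}}{\beta_i}\, \gamma_i(t), \qquad 0 \le t \le T_1$$

with

$$\beta_i\, \gamma_i(t) = K_n\gamma_i(t), \qquad 0 \le t \le T_1$$

and

$$K_n(t-s) \triangleq \begin{cases} \displaystyle\int_{-\infty}^{\infty} N(f) \exp[j\omega(t-s)]\, df & 0 \le t, s \le T_1 \\ 0 & \text{elsewhere.} \end{cases}$$

In terms of the signal representation of Theorem 3, this becomes

$$\begin{aligned}
\int_0^{T_1} y(t)\, q(t)\, dt &= \int_0^{T_1} y(t) \left[ \sum_{i=1}^{\infty} \frac{(r, \gamma_i)_{T_1}}{\beta_i}\, \gamma_i(t) \right] dt \\
&= \int_0^{T_1} y(t) \int_0^{T_1} r(s) \left[ \sum_{i=1}^{\infty} \frac{\gamma_i(s)\, \gamma_i(t)}{\beta_i} \right] ds\, dt \\
&= (y, K_1 r)_{T_1} = \sum_{i,j} y_i\, \sqrt{\lambda_j}\, x_j\, (\theta_i, K_1\theta_j)_{T_1} \\
&= \sum_i y_i\, \sqrt{\lambda_i}\, x_i = y \cdot [H]\, x = y \cdot r .
\end{aligned}$$

The probability of error is again determined by the "generalized $E/N_0$," now given by

$$\int_0^{T_1} r(t)\, q(t)\, dt = (r, K_1 r)_{T_1} = r \cdot r = \sum_{i=1}^{\infty} \lambda_i\, x_i^2 .$$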

Thus, since the input signal energy is $(x, x) = x \cdot x$, it follows that again $x(t) = \varphi_1(t)$ is the optimum signal to use to minimize probability of error for fixed input signal energy.


CHAPTER III

ERROR BOUNDS AND SIGNAL DESIGN FOR DIGITAL COMMUNICATION
OVER FIXED TIME-CONTINUOUS CHANNELS WITH MEMORY


In Sec. I-E, a model was introduced which involves the transmission of one of M signals of T seconds duration through the channel of Fig. 1 and observation of the channel output for $T_1$ seconds. Subsequently, it was shown that a suitable matrix representation for this problem is

$$y = [H]\, x + n \qquad (22)$$

where x, y, and n are the (column) vectors representing the channel input, output, and additive noise, respectively, [H] is a diagonal matrix, and n has statistically independent and identically distributed components. Using this representation and a slight generalization of recent results of Gallager (Refs. 74, 75), this chapter will present the derivation of a bound on probability of error and provide some information on optimum signal design (Refs. 76, 77).

A.  VECTOR  DIMENSIONALITY  PROBLEM 

Before proceeding with the derivation of an error bound, it is necessary to consider in detail the dimensionality of the vectors involved. In deriving the representation of Eq. (22), it was shown that the basis functions used in defining x are complete in the space of all $\mathcal{L}_2(0,T)$ signals, that is, in the space of all finite-energy signals defined on the interval [0,T]. Since it is well known (Ref. 64) that this space is infinite-dimensional, it follows that, in general, the vectors, as well as the matrix, of Eq. (22) must be infinite-dimensional. In many cases, this infinite dimensionality is of no concern and mathematical operations can be performed in the usual manner. However, an attempt to define a "density function" for an infinite-dimensional random vector leads to conceptual as well as mathematical difficulties. Consequently, problems in which this situation arises are usually approached by assuming initially that all vectors are finite-dimensional. The analysis is then performed and an attempt is made to show that a limiting form of the answer is obtained as the dimensionality becomes infinite. If such a limiting result exists, it is asserted to be the desired solution. This approach is used in the following derivations in which all vectors are initially assumed to be d-dimensional. However, for this problem, it will be shown that for minimum probability of error the signal vectors should be finite-dimensional. Furthermore, it will be found that the "optimum" dimensionality is independent of d (assuming that d is large) and thus that the original restriction of finite d involves no loss of generality. This result is obtained because the gains $\sqrt{\lambda_i}$ in the [H] matrix approach zero for large i, and it gives an indication of the useful "dimensionality" of the channel.

B.  RANDOM  CODING  TECHNIQUE 

This chapter is concerned with an investigation of the probability of error for digital communication over the channel of Fig. 1. Ideally, the first step in this analysis would be the derivation of an expression for the probability of error under the assumption of an arbitrary set of M signals. This expression would then be minimized over all allowable signals to find both the minimum possible probability of error and the set of signals that achieve this. However, it will be found here, as is usually the case, that an exact expression for $P_e$ is impossible to evaluate by any means other than numerical techniques. Thus, it is necessary to investigate some form of an approximate solution to this problem. The approach used here involves an application of the random coding technique to a suitable upper bound to $P_e$.

Conceptually, the random coding technique may be viewed in the following manner. First, an ensemble of codes† is constructed by selecting each code word of each code independently and at random according to a probability measure p(x). In other words, for each code word

$$\Pr\{x : \xi_1 \le x_1 \le \xi_1 + d\xi_1,\ \xi_2 \le x_2 \le \xi_2 + d\xi_2,\ \ldots,\ \xi_k \le x_k \le \xi_k + d\xi_k,\ \ldots\} = p(\xi)\, d\xi_1\, d\xi_2 \cdots d\xi_k \cdots = p(\xi)\, d\xi$$

and, furthermore, for any set of distinct code words, say $x_1, \ldots, x_s$,

$$p(x_1, x_2, \ldots, x_s) = p(x_1)\, p(x_2) \cdots p(x_s) .$$

Next, the probability of error for each code is evaluated. Finally, these values are averaged over the ensemble of codes to find an average probability of error $\overline{P_e}$. Viewed in this manner, it seems that the random coding technique only makes an impossible problem even more difficult. However, the simple expedient of reversing the order of ensemble averaging and evaluation of $P_e$, when coupled with a suitable bound on $P_e$, leads to a relatively simple expression for $\overline{P_e}$ for the channel of Fig. 1.
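
The conceptual procedure just described (draw codes at random, evaluate each code, average over the ensemble) can be mimicked by brute force for a toy case. The sketch below is not the analytical method of this chapter; it is a Monte Carlo illustration under assumptions introduced here: a hypothetical set of four gains $\sqrt{\lambda_i}$, Gaussian code words with independent unit-variance components as p(x), equally likely messages, and minimum-distance (maximum-likelihood) decoding on the representation y = [H] x + n.

```python
import numpy as np

def code_error_rate(code, gains, rng, trials=200):
    """Monte Carlo estimate of P_e for one code on y = [H] x + n with ML decoding."""
    m = code.shape[0]
    signals = code * gains                               # row k is [H] x_k
    sent = rng.integers(m, size=trials)                  # equally likely messages
    y = signals[sent] + rng.standard_normal((trials, code.shape[1]))
    # For unit-variance white Gaussian noise, ML decoding picks the nearest signal point.
    dist = ((y[:, None, :] - signals[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(dist.argmin(axis=1) != sent))

rng = np.random.default_rng(0)
gains = np.sqrt(np.array([4.0, 2.0, 1.0, 0.5]))          # hypothetical sqrt(lambda_i)
M, n_codes = 8, 50                                       # code words per code, codes in ensemble

# Each code word of each code is drawn independently from the same measure p(x).
pe = np.array([code_error_rate(rng.standard_normal((M, gains.size)), gains, rng)
               for _ in range(n_codes)])
pe_bar = pe.mean()                                       # ensemble-average error probability
print(pe_bar, pe.min(), pe.max())
```

The spread between `pe.min()` and `pe.max()` shows how individual codes scatter around the ensemble average, which is the quantity the analytical bound addresses.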

Accepting this statement, it is still not clear that knowledge of $\overline{P_e}$ is either meaningful or useful. For example, the knowledge of an average probability of error for an ensemble of codes is quite different from a knowledge of $P_e$ for a single code. Furthermore, it is not a priori obvious that a knowledge of $\overline{P_e}$ will provide any information on the construction of good codes. Finally, it is not clear that the value of $\overline{P_e}$ will provide any indication of the minimum possible probability of error for a code since the ensemble averaging could include a large fraction of codes having a high probability of error.

Fortunately, these doubts can be resolved quite readily. The very fact that $\overline{P_e}$ is an average over an ensemble of codes implies that there must be at least a fraction 1 − 1/A (A > 1) of the codes in the ensemble that have a $P_e$ less than $A\,\overline{P_e}$. Thus, for example, the construction of a code by choosing each code word independently and at random according to the probability measure p(x) must lead, in 99 times out of 100, to a code having a $P_e$ not greater than $100\,\overline{P_e}$. Although this procedure certainly does not yield a code having an absolute minimum probability of error, it is at present the only general approach known for constructing large codes that have a $P_e$ that is even close to this minimum.‡
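
The averaging argument above is an instance of Markov's inequality: at most a fraction 1/A of the codes can have an error probability of $A\,\overline{P_e}$ or more. The sketch below checks this on synthetic data; the beta-distributed values standing in for per-code error probabilities are a hypothetical choice made here, and the inequality holds for any nonnegative values.

```python
import numpy as np

def fraction_below(pe_values, a):
    """Fraction of codes whose error probability is less than a times the ensemble mean."""
    return float(np.mean(pe_values < a * pe_values.mean()))

rng = np.random.default_rng(0)
# Hypothetical per-code error probabilities, skewed so that a few codes are much worse
pe = rng.beta(0.5, 20.0, size=10000)

for a in (2, 10, 100):
    # Markov's inequality guarantees at least 1 - 1/a of the codes fall below a * mean
    print(a, fraction_below(pe, a), 1.0 - 1.0 / a)
```

With A = 100 the guaranteed fraction is 0.99, which corresponds to the "99 times out of 100" figure quoted above.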

Therefore, the only problem that remains to establish the usefulness of the random coding technique is to determine the accuracy of $\bar{P}_e$ relative to the minimum possible $P_e$. For both discrete, memoryless channels and the time-discrete Gaussian channel, this problem has been studied by deriving a lower bound to $P_e$ by means of the "sphere packing" argument. For these channels it has been shown that the two exponents in the bounds to $\bar{P}_e$ and $P_e$ are identical for all rates between a rate $R_c$ and capacity. Furthermore, for the Gaussian channel the value of $R_c$ is less than the rate at which a digital communication system with coding would be operated.

†For convenience, the terms "code" and "code word" are used here to mean, respectively, "a set of M finite energy signals of T seconds duration" and "one of a set of M finite energy signals of T seconds duration." In addition, no distinction is made between a signal $x_j(t)$ and the vector $\mathbf{x}_j$ representing this signal.

‡The validity and importance of this approach have been demonstrated experimentally with the SECO machine.

Thus, for this channel, the random coding technique provides a practical solution to the problem of determining probability of error.

Since the determination of a lower bound to $P_e$ does not appear practical for the channel of Fig. 1, the approach used here has been to specialize the $\bar{P}_e$ expression to the case considered by Shannon (Ref. 53) and then to compare this result to his. This analysis is performed in Sec. III-E and indicates that the bound obtained in this work is quite accurate at rates above $R_c$.

One point that has not been considered is the question of how the probability measure $p(\mathbf{x})$ should be selected. Aside from noting that the ensemble of codes should satisfy an energy constraint of the form of Eq. (1), the only statement that can be made is that a mathematically tractable function should be chosen which "seems reasonable" on the basis of experience and intuition. If the resulting expression for $\bar{P}_e$ can be shown to be sufficiently accurate, the problem is solved. Otherwise, another choice would be tried. As a practical matter, it is found that for the Gaussian channel of Fig. 1 the logical choice of a Gaussian $p(\mathbf{x})$ of the form

$$p(\mathbf{x}) = \prod_i p_i(x_i)$$

where $p_i(x_i)$ is a one-dimensional Gaussian density function, leads to satisfactory results.


C.  RANDOM  CODING  BOUND 

This section will derive a random coding bound on probability of error for digital communication over the channel of Fig. 1. Let $\mathbf{x}_1, \ldots, \mathbf{x}_M$ be an arbitrary set of M d-dimensional code words for use with this channel and assume that the a priori probability for each code word is 1/M. Let the probability density function for $\mathbf{y}$ given that $\mathbf{x}_j$ was transmitted be $p(\mathbf{y}|\mathbf{x}_j)$. Then, it is well known that the detector which minimizes probability of error decides that $\mathbf{x}_j$ was transmitted if and only if

$$p(\mathbf{y}|\mathbf{x}_k) \le p(\mathbf{y}|\mathbf{x}_j) \quad \text{for all } k = 1, 2, \ldots, M .$$

Thus, if a set of M characteristic functions are defined as

$$\varphi_j(\mathbf{y}) = \begin{cases} 0 & [\mathbf{y}: p(\mathbf{y}|\mathbf{x}_k) \le p(\mathbf{y}|\mathbf{x}_j) \text{ for all } k = 1, \ldots, M] \\ 1 & [\mathbf{y}: p(\mathbf{y}|\mathbf{x}_k) > p(\mathbf{y}|\mathbf{x}_j) \text{ for some } k = 1, \ldots, M] \end{cases}$$

it follows that the probability of error for this detector is given by

$$P_e = \frac{1}{M} \sum_{j=1}^{M} \int \varphi_j(\mathbf{y})\, p(\mathbf{y}|\mathbf{x}_j)\, d\mathbf{y} \triangleq \frac{1}{M} \sum_{j=1}^{M} P_j(e) . \tag{23}$$

This expression, while valid for any M and any set of signals $\{s_j(t)\}$, is mathematically intractable for interesting values of M. Thus, it is necessary to derive a bound on $P_e$ that is sufficiently accurate to be useful and yet is readily evaluated. The random coding technique discussed above, when applied to a suitable upper bound to Eq. (23), gives such a result.

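An obvious inequality is

$$\varphi_j(\mathbf{y}) \le \left\{ \sum_{\substack{k=1 \\ k \ne j}}^{M} \left[ \frac{p(\mathbf{y}|\mathbf{x}_k)}{p(\mathbf{y}|\mathbf{x}_j)} \right]^{1/(1+\rho)} \right\}^{\rho} \qquad \rho \ge 0$$

since the right-hand side is always non-negative and is not less than 1 when $p(\mathbf{y}|\mathbf{x}_j) \le p(\mathbf{y}|\mathbf{x}_k)$ for some $k \ne j$. Thus,

$$P_j(e) \le \int p(\mathbf{y}|\mathbf{x}_j)^{1/(1+\rho)} \left[ \sum_{\substack{k=1 \\ k \ne j}}^{M} p(\mathbf{y}|\mathbf{x}_k)^{1/(1+\rho)} \right]^{\rho} d\mathbf{y} . \tag{24}$$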
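The bound of Eq. (24) can be illustrated on a small example. The sketch below is an assumption-laden toy case and not the channel of Fig. 1: it replaces the integral over $\mathbf{y}$ by a sum over the outputs of a binary symmetric channel, and the names `likelihood` and `exact_and_bound` are hypothetical. It compares the exact error probability of the maximum-likelihood detector with the right-hand side of Eq. (24).

```python
from itertools import product
from math import prod

def likelihood(y, x, p):
    """P(y | x) for a binary symmetric channel with crossover probability p."""
    return prod(p if yi != xi else 1.0 - p for yi, xi in zip(y, x))

def exact_and_bound(codewords, j, p, rho):
    """Exact P_j(e) for the detector of Eq. (23) and the bound of Eq. (24)."""
    n = len(codewords[j])
    exact = 0.0
    bound = 0.0
    for y in product((0, 1), repeat=n):
        pj = likelihood(y, codewords[j], p)
        others = [likelihood(y, x, p) for k, x in enumerate(codewords) if k != j]
        # phi_j(y) = 1 when some other code word is strictly more likely.
        if any(pk > pj for pk in others):
            exact += pj
        s = sum(pk ** (1.0 / (1.0 + rho)) for pk in others)
        bound += pj ** (1.0 / (1.0 + rho)) * s ** rho
    return exact, bound

code = [(0, 0, 0), (1, 1, 1)]          # illustrative two-word repetition code
for rho in (0.25, 0.5, 1.0):
    exact, bound = exact_and_bound(code, j=0, p=0.1, rho=rho)
    print(f"rho = {rho}: exact = {exact:.4f}, bound = {bound:.4f}")
```

For this toy code with crossover probability 0.1, the exact value is 0.028 and the bound with $\rho = 1$ is 0.216; the bound is loose for so small a code but holds for every $\rho \ge 0$.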


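Now, let each code word be chosen independently and at random according to a probability measure $p(\mathbf{x})$ and average both sides of Eq. (24) over this ensemble of codes. Then

$$\overline{P_j(e)} = \bar{P}_e \le \int \overline{p(\mathbf{y}|\mathbf{x}_j)^{1/(1+\rho)}} \;\; \overline{\left[ \sum_{\substack{k=1 \\ k \ne j}}^{M} p(\mathbf{y}|\mathbf{x}_k)^{1/(1+\rho)} \right]^{\rho}} \, d\mathbf{y} \tag{25}$$

where the bar denotes averaging with respect to the ensemble of codes and the independence of the selection of $\mathbf{x}_j$ and $\mathbf{x}_k$ has been used for the average under the integral. Equation (25) can be further upper bounded by noting that $\overline{z^{\rho}} \le \bar{z}^{\rho}$ for $0 \le \rho \le 1$ (Ref. 78). Introducing this inequality into Eq. (25), and recalling that the average of a sum of random variables equals the sum of the individual averages, gives

$$\bar{P}_e < M^{\rho} \int \left[ \int p(\mathbf{y}|\mathbf{x})^{1/(1+\rho)}\, p(\mathbf{x})\, d\mathbf{x} \right]^{1+\rho} d\mathbf{y} \qquad 0 \le \rho \le 1$$

or

$$\bar{P}_e < \exp[-T E(R, \rho)] \tag{26}$$

where

$$E(R, \rho) = E_0(\rho) - \rho R \qquad 0 \le \rho \le 1$$

$$E_0(\rho) = -\frac{1}{T} \ln \int \left[ \int p(\mathbf{y}|\mathbf{x})^{1/(1+\rho)}\, p(\mathbf{x})\, d\mathbf{x} \right]^{1+\rho} d\mathbf{y}$$

$$R = \frac{1}{T} \ln M .$$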


This bound, which applies to any channel for which the indicated integrals exist, will now be specialized to the channel of Fig. 1. Recall that in deriving Eq. (26) it was assumed that the set of signals $\{s_j(t)\}$ were d-dimensional, i.e., each of the signals could be written as a linear combination of d basis functions. There was, however, no assumption made with respect to which set of d functions should be used. Thus, if I denotes the set of integers that specify the basis functions used and if the matrix representations of Theorems 1, 2, and 3 are recalled, it follows that for the channel of Fig. 1

$$p(\mathbf{y}|\mathbf{x}) = \prod_{i \in I} \frac{1}{\sqrt{2\pi}} \exp\left[ -\tfrac{1}{2} \left( y_i - \sqrt{\lambda_i}\, x_i \right)^2 \right] . \tag{27}$$

Next, let the probability measure $p(\mathbf{x})$ which defines the random ensemble of codes be

$$p(\mathbf{x}) = \prod_{i \in I} \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[ -\tfrac{1}{2} \left( x_i^2 / \sigma_i^2 \right) \right] . \tag{28}$$
There are two reasons for this choice of $p(\mathbf{x})$:

(1) This form of $p(\mathbf{x})$ results in a mathematically tractable expression for the error exponent of Eq. (26).

(2) When the resulting exponent is specialized to the time-discrete case considered by Shannon (Ref. 53) it is within a few percent of his random coding exponent (see Sec. III-E). Furthermore, Shannon's exponent was shown to be identical to the exponent in a lower bound to probability of error over a range of rates that are of considerable practical interest.

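Finally, assume an average power constraint on the ensemble of codes of the form

$$\sum_{i \in I} \sigma_i^2 = ST . \tag{29}$$

Substituting Eqs. (27) and (28) into Eq. (26) gives, after evaluation of the integrals,

$$E(R, \rho, \boldsymbol{\sigma}) = \frac{\rho}{2T} \sum_{i \in I} \ln\left[ 1 + \frac{\lambda_i \sigma_i^2}{1 + \rho} \right] - \rho R \tag{30}$$

where

$$\boldsymbol{\sigma} \triangleq (\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2, \ldots) .$$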

For fixed R, maximization of Eq. (30) over $\rho$, $\boldsymbol{\sigma}$, and the set I gives the desired random-coding error exponent. For convenience, let this maximization be performed in the order I, $\boldsymbol{\sigma}$, $\rho$.

Maximization over the set I is easily accomplished by recalling that the $\lambda_i$ are by assumption ordered so that $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots$. Thus, the monotonic property of $\ln x$ for $x \ge 1$ implies that $E(R, \rho, \boldsymbol{\sigma})$ is maximized over the set I by choosing I = {1, 2, ..., d}.

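The maximization over $\boldsymbol{\sigma}$ is most readily accomplished by using the properties of convex functions defined on a vector space (Ref. 79). For this purpose, the following definitions and a theorem of Kuhn and Tucker (Ref. 80), in present notation, are presented:

(1) A region of vector space is defined as convex if for any two vectors $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ in the region and for any $\lambda$, $0 \le \lambda \le 1$, the vector $\lambda\boldsymbol{\alpha} + (1 - \lambda)\boldsymbol{\beta}$ is also in the region.

(2) A function $f(\boldsymbol{\alpha})$ whose domain is a convex region of vector space is defined as concave if, for any two vectors $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ in the domain of f and for any $\lambda$, $0 < \lambda < 1$,

$$\lambda f(\boldsymbol{\alpha}) + (1 - \lambda) f(\boldsymbol{\beta}) \le f[\lambda\boldsymbol{\alpha} + (1 - \lambda)\boldsymbol{\beta}] .$$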

From these definitions it follows that the region of Euclidean d-space defined by the vector $\boldsymbol{\sigma}$, with

$$\sigma_i^2 \ge 0 \quad \text{and} \quad \sum_{i=1}^{d} \sigma_i^2 = TS$$

is a convex region of vector space, that $\ln x$ is a concave function for $x \ge 1$, that a sum of concave functions is concave, and thus that $E(R, \rho, \boldsymbol{\sigma})$ is a concave function of $\boldsymbol{\sigma}$.

Fig. 5. Concerning interpretation of certain parameters in error exponent: (a) Interpretation of $B_T(\rho)$ and N; (b) Interpretation of $B(\rho)$ and W.


Theorem 4 (Kuhn and Tucker).

Let $f(\boldsymbol{\sigma})$ be a continuous differentiable concave function in the region in which $\boldsymbol{\sigma}$ satisfies $\sum_{i=1}^{d} \sigma_i^2 = TS$ and $\sigma_i^2 \ge 0$, i = 1, 2, ..., d. Then a necessary and sufficient condition for $\boldsymbol{\sigma}$ to maximize f is

$$\frac{\partial f(\boldsymbol{\sigma})}{\partial \sigma_i^2} \le \Lambda \quad \text{for all } i \text{, with equality if and only if } \sigma_i^2 \ne 0$$

where $\Lambda$ is a constant independent of i whose value is adjusted to satisfy the constraint $\sum_{i=1}^{d} \sigma_i^2 = TS$.

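It follows that the $\boldsymbol{\sigma}$ maximizing Eq. (30) must satisfy

$$\frac{\partial E(R, \rho, \boldsymbol{\sigma})}{\partial \sigma_i^2} = \frac{\rho}{2T}\, \frac{\lambda_i / (1 + \rho)}{1 + \lambda_i \sigma_i^2 / (1 + \rho)} \le \Lambda \qquad \text{all } i = 1, \ldots, d$$

with equality if and only if $\sigma_i^2 > 0$. Thus,

$$\sigma_i^2 = \begin{cases} (1 + \rho) \left[ \dfrac{1}{B_T(\rho)} - \dfrac{1}{\lambda_i} \right] & i = 1, \ldots, N \\ 0 & i = N + 1, \ldots, d \end{cases} \tag{31}$$

where N is defined by

$$\lambda_N > B_T(\rho) \ge \lambda_{N+1}$$

and

$$\frac{1}{B_T(\rho)} \triangleq \frac{\rho}{2T\Lambda(1 + \rho)} .$$

The value of $B_T(\rho)$, and thus N, is chosen to satisfy the constraint

$$\sum_{i=1}^{N} \sigma_i^2 = TS$$

which yields

$$\frac{1}{B_T(\rho)} = \frac{1}{N} \left[ \frac{ST}{1 + \rho} + \sum_{i=1}^{N} \lambda_i^{-1} \right] . \tag{32}$$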


A convenient method for interpreting $B_T(\rho)$ and N is presented in Fig. 5(a). This is simply the discrete form of the well-known water-pouring interpretation discussed by Fano and others for the special case of channel capacity. Substituting Eqs. (31) and (32) into Eq. (30) gives

$$E(R, \rho) = \frac{\rho}{2T} \sum_{i=1}^{N} \ln\left[ \frac{\lambda_i}{B_T(\rho)} \right] - \rho R . \tag{33}$$
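The water-pouring solution of Eqs. (31) and (32) and the resulting exponent of Eq. (33) can be sketched in a few lines. This is an illustrative sketch only: the function names `water_fill` and `exponent`, the search over N from the largest candidate downward, and the eigenvalues in the example are assumptions made here and are not taken from the original analysis.

```python
import math

def water_fill(lams, S, T, rho):
    """Return (N, B_T, sigma2) following Eqs. (31) and (32).

    lams must be positive and sorted in decreasing order.
    """
    for n in range(len(lams), 0, -1):
        inv_b = (S * T / (1.0 + rho) + sum(1.0 / l for l in lams[:n])) / n
        b = 1.0 / inv_b
        if lams[n - 1] > b:      # every retained coordinate receives positive energy
            break
    sigma2 = [(1.0 + rho) * (1.0 / b - 1.0 / l) if i < n else 0.0
              for i, l in enumerate(lams)]
    return n, b, sigma2

def exponent(lams, S, T, rho, R):
    """E(R, rho) of Eq. (33) for the water-filled energy assignment."""
    n, b, _ = water_fill(lams, S, T, rho)
    return rho / (2.0 * T) * sum(math.log(l / b) for l in lams[:n]) - rho * R

lams = [1.0, 0.5, 0.25, 0.01]    # hypothetical eigenvalues
n, b, sigma2 = water_fill(lams, S=2.0, T=2.0, rho=1.0)
print(n, b, sigma2, sum(sigma2))
```

For these hypothetical values only the first two coordinates are used, the energies sum to ST = 4, and each retained term satisfies $1 + \lambda_i\sigma_i^2/(1+\rho) = \lambda_i/B_T(\rho)$, which is the substitution that turns Eq. (30) into Eq. (33).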


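Maximization over $\rho$ is accomplished using standard techniques of differential calculus and leads to the final result

$$E_T(\rho) = \left( \frac{\rho}{1 + \rho} \right)^2 \frac{S}{2}\, B_T(\rho) \qquad R(1) \le R \le R(0) \tag{34}$$

where

$$R(\rho) = \frac{1}{2T} \sum_{i=1}^{N} \ln\left[ \frac{\lambda_i}{B_T(\rho)} \right] - \frac{\rho}{(1 + \rho)^2}\, \frac{S}{2}\, B_T(\rho) \qquad 0 \le \rho \le 1 \tag{35}$$

and

$$E_T(R) = \frac{1}{2T} \sum_{i=1}^{N} \ln\left[ \frac{\lambda_i}{B_T(1)} \right] - R \qquad 0 \le R \le R(1) . \tag{36}$$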


A bound that is in some cases more useful, and in all cases more readily evaluated, can be derived by considering Eqs. (34) to (36) for $T \to \infty$. It is shown in Appendix B that the resulting form for the exponent is

$$E(\rho) = \left( \frac{\rho}{1 + \rho} \right)^2 \frac{S}{2}\, B(\rho) \qquad R_c \le R \le C \tag{37}$$

$$R(\rho) = \int_W \ln\left[ \frac{|H(f)|^2}{N(f)\, B(\rho)} \right] df - \frac{\rho}{(1 + \rho)^2}\, \frac{S}{2}\, B(\rho) \qquad 0 \le \rho \le 1 \tag{38}$$

$$E(R) = \int_W \ln\left[ \frac{|H(f)|^2}{N(f)\, B(1)} \right] df - R \qquad 0 \le R \le R_c \tag{39}$$

where

$$C = R(0)$$

$$R_c = R(1)$$

$$\frac{1}{B(\rho)} = \frac{1}{W} \left[ \frac{S}{2(1 + \rho)} + \int_W \frac{N(f)}{|H(f)|^2}\, df \right] .$$

A convenient method for interpreting the significance of $B(\rho)$ and W is illustrated in Fig. 5(b). Pertinent properties of the exponent of Eqs. (37) to (39) are presented in Fig. 6.

D.  BOUND  FOR  "VERY  NOISY"  CHANNELS 

In this section, an asymptotic form for the bound of Eqs. (37) to (39) is derived for the condition $S \to 0$. Consider first the bound for $0 \le \rho \le 1$ and recall (Fig. 1) that N(f) is assumed to be normalized so that

$$\max_f \frac{|H(f)|^2}{N(f)} = 1 .$$

Fig. 6. Random coding exponent for channel in Fig. 1. (Horizontal axis: R, nats.)

Thus, for S "sufficiently small," it is clear from the water-pouring interpretation of Fig. 5(b) that except for pathological H(f) and N(f),†

$$\frac{|H(f)|^2}{N(f)} \approx 1 \qquad \text{for } f \in W .$$

Introducing this approximation into the expression for $B(\rho)$ gives

$$B(\rho) \approx \frac{1}{1 + \dfrac{S}{2W(1 + \rho)}} \approx 1 - \frac{S}{2W(1 + \rho)} \qquad \text{for } S \to 0 .$$

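In view of this, Eq. (37) becomes to first order in S

$$E(\rho) \approx \left( \frac{\rho}{1 + \rho} \right)^2 \frac{S}{2} \qquad R_c \le R \le C . \tag{40}$$

Introducing the same approximations into Eq. (38) gives

$$R(\rho) \approx -\int_W \ln B(\rho)\, df - \frac{\rho}{(1 + \rho)^2}\, \frac{S}{2} \approx \int_W \left[ \frac{1}{B(\rho)} - 1 \right] df - \frac{\rho}{(1 + \rho)^2}\, \frac{S}{2}$$

$$= \frac{S}{2(1 + \rho)} - \frac{\rho}{(1 + \rho)^2}\, \frac{S}{2} = \frac{S}{2(1 + \rho)^2} \qquad 0 \le \rho \le 1 . \tag{41}$$

Finally, elimination of $\rho$ between Eqs. (40) and (41) gives

$$E(R) \approx C \left[ 1 - \sqrt{\frac{R}{C}} \right]^2 \qquad R_c \le R \le C \tag{42}$$

where

$$C = \frac{S}{2}$$

$$R_c = \frac{S}{8} = \frac{C}{4} .$$

In a similar manner, it is found that Eq. (39) becomes approximately

$$E(R) \approx C \left[ \frac{1}{2} - \frac{R}{C} \right] \qquad 0 \le R \le R_c . \tag{43}$$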
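The two pieces of the small-signal exponent can be evaluated directly. The following is a minimal sketch, assuming only Eqs. (40) to (43) as written above; the function name `very_noisy_exponent` and the numerical value of S are illustrative choices, not part of the original text.

```python
import math

def very_noisy_exponent(R, S):
    """Small-signal exponent of Eqs. (42) and (43), with C = S/2 and R_c = C/4."""
    C = S / 2.0
    if not 0.0 <= R <= C:
        raise ValueError("rate must satisfy 0 <= R <= C")
    if R <= C / 4.0:
        return C * (0.5 - R / C)                  # Eq. (43), straight-line portion
    return C * (1.0 - math.sqrt(R / C)) ** 2      # Eq. (42), curved portion

S = 0.02                                          # illustrative small signal power
C = S / 2.0
for rho in (0.0, 0.25, 0.5, 1.0):
    R = S / (2.0 * (1.0 + rho) ** 2)              # Eq. (41)
    E = (rho / (1.0 + rho)) ** 2 * S / 2.0        # Eq. (40)
    print(rho, R, E, very_noisy_exponent(R, S))
```

The parametric pair of Eqs. (40) and (41) reproduces Eq. (42) for every $\rho$ in [0, 1], and at $R = R_c = C/4$ both Eq. (42) and Eq. (43) give $E = C/4$, so the two portions join continuously.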


†Note that due to the normalization of N(f) the statement that S is "sufficiently small" is equivalent to the statement that a suitably defined signal-to-noise ratio is "sufficiently small."

The result of Eqs. (42) and (43) is of interest for several reasons.

(1) It is independent of the filter and noise spectral characteristics and is thus a "universal" bound for the channel of Fig. 1.

(2)  It  is  identical  to  the  exponent  in  the  bound  on  probability  of  error  for  the 
transmission  of  orthogonal  signals  over  a  white  Gaussian  channel.^1  This 
implies  that  for  "sufficiently  small”  signal  power  the  memory  of  the  chan¬ 
nel  in  Fig.  1  has  negligible  effect  on  probability  of  error. 

(3)  It  agrees  precisely  with  the  small  signal-to-noise  ratio  (SNR)  bound 
found  by  Shannon  for  the  band-limited  channel,  3  and  is  identical  except 
for  the  definition  of  C,  to  a  bound  found  by  Gallager7^  for  "very  noisy” 
discrete  memoryless  channels.  Thus,  the  bound  of  Eqs.  (42)  and  (43) 
can,  when  C  is  appropriately  defined,  be  considered  to  be  a  "universal” 
bound  for  "very  noisy”  time -invariant  channels. 


E.  IMPROVED  LOW-RATE  RANDOM  CODING  BOUND 


The previous sections have presented a random coding bound on probability of error for the channel of Fig. 1. As is usually the case with random coding bounds, this bound can be shown to be quite poor under the conditions of low rate and high signal power.† In fact, it is not difficult to show that the true exponent in the bound on the smallest attainable probability of error differs from the random coding exponent by an arbitrarily large amount as the rate approaches zero and the signal power approaches infinity. This section presents an improved low-rate random coding bound, based upon a slight generalization of recent work by Gallager, that overcomes this difficulty.

Before proceeding with the derivation of an improved bound, it is important to consider briefly the reason for the inaccuracy of the random coding bound. As indicated previously, the random coding bound is, in principle, obtained in the following manner. First, an ensemble of codes is constructed by selecting each code word of each code independently and at random according to a probability measure $p(\mathbf{x})$. Next, the probability of error $P_e$ is calculated for each code. Finally, these values of $P_e$ are averaged to obtain $\bar{P}_e$. Note, however, that nothing in this procedure precludes the possibility that a small fraction of the codes in the ensemble may have a $P_e$ considerably greater than that for the remaining codes. Thus, it is possible that $\bar{P}_e$ could be determined almost entirely by a small percentage of high-$P_e$ codes. (This is simply illustrated by considering a hypothetical situation in which 1 percent of the codes have a $P_e$ of $10^{-1}$ while the remaining 99 percent have a $P_e$ of $10^{-10}$.) An improved bound is derived here by expurgating the high probability of error code words from each code in the ensemble used previously.
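The hypothetical situation in parentheses above is easily worked out. The sketch below simply carries out that arithmetic; the helper name `ensemble_average` is an assumption introduced for illustration.

```python
def ensemble_average(groups):
    """Average P_e over an ensemble described by (fraction_of_codes, P_e) pairs."""
    total = sum(frac for frac, _ in groups)
    if abs(total - 1.0) > 1e-12:
        raise ValueError("fractions must sum to 1")
    return sum(frac * pe for frac, pe in groups)

# 1 percent of the codes at 1e-1, the remaining 99 percent at 1e-10.
groups = [(0.01, 1e-1), (0.99, 1e-10)]
avg = ensemble_average(groups)
share_from_bad = 0.01 * 1e-1 / avg
print(f"average P_e = {avg:.6e}")
print(f"share of the average due to the poor 1 percent = {share_from_bad:.7f}")
```

The average is very nearly $10^{-3}$, seven orders of magnitude above the error probability of 99 percent of the codes, and essentially all of it is contributed by the poor 1 percent.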

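Consider the channel of Fig. 1 and let $\mathbf{x}_1, \ldots, \mathbf{x}_M$ be a set of code words for use with this channel. Then, given that $\mathbf{x}_j$ is transmitted, it follows from Eq. (24) with $\rho = 1$ that

$$P_j(e) \le \int p(\mathbf{y}|\mathbf{x}_j)^{1/2} \sum_{\substack{k=1 \\ k \ne j}}^{M} p(\mathbf{y}|\mathbf{x}_k)^{1/2}\, d\mathbf{y} \tag{44}$$

or equivalently

$$P_j(e) \le \sum_{\substack{k=1 \\ k \ne j}}^{M} Q(\mathbf{x}_j, \mathbf{x}_k) \tag{45}$$

where

$$Q(\mathbf{x}_j, \mathbf{x}_k) \triangleq \int p(\mathbf{y}|\mathbf{x}_j)^{1/2}\, p(\mathbf{y}|\mathbf{x}_k)^{1/2}\, d\mathbf{y} .$$

†Comments analogous to those in the immediately preceding footnote also apply to the term "high signal power."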


Q( 


From these equations it is clear that $P_j(e)$ is a function of each of the code words in the code. Thus, for a random ensemble of codes in which each code word is chosen with a probability measure $p(\mathbf{x})$, $P_j(e)$ will be a random variable, and it will be meaningful to discuss the probability that $P_j(e)$ is not less than some number A, that is, $\Pr\{P_j(e) \ge A\}$. To proceed further, a simple bound on this probability is required. Following Gallager (Ref. 75), let a function $\varphi_j(\mathbf{x}_1, \ldots, \mathbf{x}_M)$ be defined as

$$\varphi_j(\mathbf{x}_1, \ldots, \mathbf{x}_M) = \begin{cases} 1 & \text{if } P_j(e) \ge A \\ 0 & \text{if } P_j(e) < A . \end{cases} \tag{46}$$

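Then, with a bar used to indicate an average over the ensemble of codes, it follows directly that

$$\Pr\{P_j(e) \ge A\} = \overline{\varphi_j(\mathbf{x}_1, \ldots, \mathbf{x}_M)} . \tag{47}$$

From Eq. (45) an obvious inequality is

$$\varphi_j(\mathbf{x}_1, \ldots, \mathbf{x}_M) \le A^{-s} \sum_{\substack{k=1 \\ k \ne j}}^{M} Q(\mathbf{x}_j, \mathbf{x}_k)^{s} \qquad 0 < s \le 1 \tag{48}$$

since the right-hand side is always non-negative (for A > 0) and is not less than 1 when $P_j(e) \ge A$ and $0 < s \le 1$. Thus,

$$\Pr\{P_j(e) \ge A\} \le A^{-s} \sum_{\substack{k=1 \\ k \ne j}}^{M} \overline{Q(\mathbf{x}_j, \mathbf{x}_k)^{s}} \qquad 0 < s \le 1 \tag{49}$$

where, due to the statistical independence of $\mathbf{x}_j$ and $\mathbf{x}_k$ over the ensemble of codes,

$$\overline{Q(\mathbf{x}_j, \mathbf{x}_k)^{s}} = \int \int \left[ \int p(\mathbf{y}|\mathbf{x}_j)^{1/2}\, p(\mathbf{y}|\mathbf{x}_k)^{1/2}\, d\mathbf{y} \right]^{s} p(\mathbf{x}_j)\, p(\mathbf{x}_k)\, d\mathbf{x}_j\, d\mathbf{x}_k .$$

In this form it is clear that $\overline{Q(\mathbf{x}_j, \mathbf{x}_k)^{s}}$ is independent of j and k and therefore that Eq. (49) reduces to

$$\Pr\{P_j(e) \ge A\} \le (M - 1)\, A^{-s}\, \overline{Q(\mathbf{x}_j, \mathbf{x}_k)^{s}} \qquad 0 < s \le 1 . \tag{50}$$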

At this point it is convenient to choose A to make the right-hand side of Eq. (50) equal to 1/2. Solving for the value of A gives

$$A = [2(M - 1)]^{\rho} \left[ \overline{Q(\mathbf{x}_j, \mathbf{x}_k)^{1/\rho}} \right]^{\rho} \tag{51}$$

where $\rho \triangleq 1/s$. Now, let all code words in the ensemble for which $P_j(e) \ge A$ be expurgated. Then from Eq. (51) all remaining code words satisfy

$$P_j(e) < [2M]^{\rho} \left[ \overline{Q(\mathbf{x}_j, \mathbf{x}_k)^{1/\rho}} \right]^{\rho} \qquad \rho \ge 1 . \tag{52}$$

Furthermore, since $\Pr\{P_j(e) \ge A\} \le 1/2$, it follows that the average number of code words M' remaining in each code satisfies $M' \ge M/2$. Thus, there exists at least one code in the expurgated set containing not less than M/2 code words and having a probability of error for each code word satisfying Eq. (52). By setting exp[RT] = M/2, it follows that there exists a code for which

$$P_e < 4^{\rho} \exp[-T E^{e}(R, \rho)]$$

where

$$E^{e}(R, \rho) = E_0^{e}(\rho) - \rho R \qquad \rho \ge 1 \tag{53}$$

and

$$E_0^{e}(\rho) \triangleq -\frac{\rho}{T} \ln \overline{Q(\mathbf{x}_j, \mathbf{x}_k)^{1/\rho}} .$$

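This bound will now be applied to the channel of Fig. 1. As before, let

$$p(\mathbf{y}|\mathbf{x}) = \prod_{i \in I} \frac{1}{\sqrt{2\pi}} \exp\left[ -\tfrac{1}{2} \left( y_i - \sqrt{\lambda_i}\, x_i \right)^2 \right] \tag{54}$$

let $p(\mathbf{x})$ be chosen as

$$p(\mathbf{x}) = \prod_{i \in I} \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[ -\tfrac{1}{2} \left( x_i^2 / \sigma_i^2 \right) \right] \tag{55}$$

and let the input power constraint be given by

$$\sum_{i \in I} \sigma_i^2 = ST . \tag{56}$$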

[This form for $p(\mathbf{x})$ has been chosen primarily for mathematical expediency. However, results obtained by Gallager indicate that it is indeed a meaningful choice from the standpoint of maximizing the resulting exponent.] Substituting Eqs. (55) and (56) in Eq. (53) yields, after the integrals are evaluated,

$$E_0^{e}(\rho, \boldsymbol{\sigma}) = \frac{\rho}{2T} \sum_{i \in I} \ln\left[ 1 + \frac{\lambda_i \sigma_i^2}{2\rho} \right] \qquad \rho \ge 1 \tag{57}$$

where

$$\boldsymbol{\sigma} \triangleq (\sigma_1^2, \ldots, \sigma_k^2, \ldots) .$$

For fixed R and T, maximization of $E_0^{e}(\rho, \boldsymbol{\sigma}) - \rho R$ over $\rho$, $\boldsymbol{\sigma}$, and the set I gives the desired bound. [In the maximization over $\rho$ it is assumed that $RT \gg \ln 4$. This allows the factor of $4^{\rho}$ in Eq. (53) to be neglected in performing the maximization and involves no loss of generality since RT = ln M and large values of M are of interest.] Comparison of Eqs. (53) and (57) with Eq. (30) reveals a strong similarity in the analytical expressions for the two bounds. As a result, the maximization procedure used previously can be applied without change to this problem, yielding the final result

$$P_e < 4^{\rho} \exp[-T E_T^{e}(\rho)] \qquad \rho \ge 1 \tag{58}$$

where

$$E_T^{e}(\rho) \triangleq \frac{S}{4}\, B_T(2\rho - 1)$$

$$R(\rho) \triangleq \frac{1}{2T} \sum_{i=1}^{N} \ln\left[ \frac{\lambda_i}{B_T(2\rho - 1)} \right] - \frac{S}{4\rho}\, B_T(2\rho - 1)$$

$$\frac{1}{B_T(\cdot)} = \frac{1}{N} \left[ \frac{ST}{1 + (\cdot)} + \sum_{i=1}^{N} \lambda_i^{-1} \right]$$

and N satisfies

$$\lambda_N > B_T(2\rho - 1) \ge \lambda_{N+1} .$$
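The single-coordinate average behind Eq. (57) can be checked numerically. The sketch below rests on a standard fact that is assumed here rather than quoted from the text: for the unit-variance Gaussian density of Eq. (54), $Q = \exp(-\lambda d^2/8)$ with $d = x_j - x_k$, and under Eq. (55) the difference d is Gaussian with variance $2\sigma^2$. The function names and parameter values are illustrative.

```python
import math

def avg_q_power_numeric(lam, sigma2, rho, steps=20001, span=12.0):
    """Numerically average Q**(1/rho) over d = x_j - x_k for one coordinate."""
    var = 2.0 * sigma2                        # variance of the difference d
    half_width = span * math.sqrt(var)
    h = 2.0 * half_width / (steps - 1)
    total = 0.0
    for n in range(steps):
        d = -half_width + n * h
        density = math.exp(-d * d / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        total += math.exp(-lam * d * d / (8.0 * rho)) * density * h
    return total

def avg_q_power_closed(lam, sigma2, rho):
    """Closed form implied by the logarithm term of Eq. (57)."""
    return (1.0 + lam * sigma2 / (2.0 * rho)) ** -0.5

lam, sigma2, rho = 0.7, 3.0, 2.0              # hypothetical values
print(avg_q_power_numeric(lam, sigma2, rho), avg_q_power_closed(lam, sigma2, rho))
```

Taking $-(\rho/T)\ln$ of the closed form gives $(\rho/2T)\ln[1 + \lambda\sigma^2/(2\rho)]$ per coordinate, which matches Eq. (57) term by term and shows why $1 + \rho$ in Eq. (30) is replaced by $2\rho$ here.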

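As before, a bound that is more readily evaluated can be derived by considering Eq. (58) for $T \to \infty$. The result, which is proved in Appendix B, is

$$P_e < 4^{\rho} \exp[-T E^{e}(\rho)] \qquad \rho \ge 1 \tag{59}$$

where

$$E^{e}(\rho) \triangleq \frac{S}{4}\, B(2\rho - 1)$$

$$R(\rho) \triangleq \int_W \ln\left[ \frac{|H(f)|^2}{N(f)\, B(2\rho - 1)} \right] df - \frac{S}{4\rho}\, B(2\rho - 1)$$

$$\frac{1}{B(\cdot)} = \frac{1}{W} \left[ \frac{S}{2[1 + (\cdot)]} + \int_W \frac{N(f)}{|H(f)|^2}\, df \right]$$

$$W \triangleq \left\{ f : \frac{|H(f)|^2}{N(f)} \ge B(2\rho - 1) \right\} .$$

Figure 7 presents the pertinent characteristics of this bound and relates it to the previous random coding bound. Note from Eq. (59) that $B(2\rho - 1)$ and W can be interpreted in terms of the water-pouring picture of Fig. 5(b) by simply replacing $S/[2(1 + \rho)]$, where $0 \le \rho \le 1$, with $S/4\rho$, where $\rho \ge 1$.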

This completes the derivation of the error bounds for the channel of Fig. 1. There remains the problem of investigating the accuracy of these results since there is no assurance that they will be sufficiently accurate to be useful.† As noted in Sec. III-B, the only practical approach to this problem is to specialize the bounds obtained here to the case considered by Shannon and then compare the two results. This is most readily accomplished by considering the present bounds in the form of Eqs. (34) to (36) and Eq. (58) for the case

$$\lambda_i = \begin{cases} 1 & i = 1, 2, \ldots, 2T \quad (T \text{ an integer}) \\ 0 & i > 2T . \end{cases}$$

Fig. 7. Error exponent for channel in Fig. 1.

Fig. 8. Comparison of error exponents.

When evaluated these become

$$E(\rho) = \left( \frac{\rho}{1 + \rho} \right)^2 \frac{S}{2} \left[ 1 + \frac{S}{2(1 + \rho)} \right]^{-1} \qquad R_c \le R \le C$$

$$R(\rho) = \ln\left[ 1 + \frac{S}{2(1 + \rho)} \right] - \frac{\rho}{(1 + \rho)^2}\, \frac{S}{2} \left[ 1 + \frac{S}{2(1 + \rho)} \right]^{-1} \text{ nats/second} \qquad 0 \le \rho \le 1$$

$$E(R) = \ln\left[ 1 + \frac{S}{4} \right] - R \qquad 0 \le R \le R_c$$

$$E^{e}(\rho) = \frac{S}{4} \left[ 1 + \frac{S}{4\rho} \right]^{-1}$$

$$R(\rho) = \ln\left[ 1 + \frac{S}{4\rho} \right] - \frac{S}{4\rho} \left[ 1 + \frac{S}{4\rho} \right]^{-1} \text{ nats/second} \qquad \rho \ge 1$$

and are plotted in Fig. 8 for S/2 = 256. The corresponding bounds of Shannon are also plotted in Fig. 8 and have been obtained by observing that the relation between the present notation and that of Shannon is

$$2T = n$$

$$S/2 = A^2$$

$$R = 2R_s .$$

It should be noted that the new bound is tighter than Shannon's for low rates although somewhat weaker for rates near capacity. Furthermore, the new bound is quite close to Shannon's lower bound exponent over a range of rates around $R_c$ that are of considerable practical interest.
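The band-limited expressions above are simple enough to tabulate. The following sketch evaluates them for S/2 = 256, the value used for Fig. 8; the function names are illustrative assumptions, and the forms of R and E are those obtained by setting every retained eigenvalue to 1 in Eqs. (34) to (36) and Eq. (58).

```python
import math

def bandlimited_random_coding(S, rho):
    """(R, E) on the curved portion of the random coding bound, 0 <= rho <= 1."""
    B = 1.0 / (1.0 + S / (2.0 * (1.0 + rho)))
    E = (rho / (1.0 + rho)) ** 2 * (S / 2.0) * B
    R = math.log(1.0 / B) - rho / (1.0 + rho) ** 2 * (S / 2.0) * B
    return R, E

def bandlimited_expurgated(S, rho):
    """(R, E) on the expurgated bound, rho >= 1."""
    B = 1.0 / (1.0 + S / (4.0 * rho))
    return math.log(1.0 / B) - S / (4.0 * rho) * B, (S / 4.0) * B

S = 512.0                                   # S/2 = 256
C, _ = bandlimited_random_coding(S, 0.0)    # capacity, R(0)
R_c, E_c = bandlimited_random_coding(S, 1.0)
R_comp = math.log(1.0 + S / 4.0)            # rate-axis intercept of the straight line
R_x, E_x = bandlimited_expurgated(S, 1.0)
print(f"C = {C:.3f}, R_c = {R_c:.3f}, R_comp = {R_comp:.3f} nats/second")
print(f"E at R_c = {E_c:.4f}, expurgated bound starts at R = {R_x:.3f}, E = {E_x:.4f}")
```

Both $(R_c, E_c)$ and the $\rho = 1$ point of the expurgated bound lie on the straight line $E = R_{comp} - R$, so the three portions of the exponent join without a gap.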

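In order to obtain additional insight into the form of the error bounds for different channels, it is of interest to evaluate the bounds for the channel defined by

$$H(f) = \frac{1}{1 + jf}$$

$$N(f) = 1 .$$

Substitution of these expressions into Eqs. (37) to (39) and Eq. (59) gives

$$E(\rho) = \left( \frac{\rho}{1 + \rho} \right)^2 \frac{S}{2} \left\{ 1 + \left[ \frac{3S}{4(1 + \rho)} \right]^{2/3} \right\}^{-1} \qquad R_c \le R \le C$$

$$R(\rho) = 2 \left\{ \left[ \frac{3S}{4(1 + \rho)} \right]^{1/3} - \tan^{-1} \left[ \frac{3S}{4(1 + \rho)} \right]^{1/3} \right\} - \frac{\rho}{(1 + \rho)^2}\, \frac{S}{2} \left\{ 1 + \left[ \frac{3S}{4(1 + \rho)} \right]^{2/3} \right\}^{-1} \text{ nats/second} \qquad 0 \le \rho \le 1$$

$$E(R) = 2 \left[ \left( \frac{3S}{8} \right)^{1/3} - \tan^{-1} \left( \frac{3S}{8} \right)^{1/3} \right] - R \qquad 0 \le R \le R_c$$

and

$$E^{e}(\rho) = \frac{S}{4} \left[ 1 + \left( \frac{3S}{8\rho} \right)^{2/3} \right]^{-1} \qquad \rho \ge 1$$

$$R(\rho) = 2 \left\{ \left( \frac{3S}{8\rho} \right)^{1/3} - \tan^{-1} \left( \frac{3S}{8\rho} \right)^{1/3} \right\} - \frac{S}{4\rho} \left[ 1 + \left( \frac{3S}{8\rho} \right)^{2/3} \right]^{-1} \text{ nats/second} .$$

Figure 9 presents these bounds for $S = 10^3$. It is interesting to note that the only basic difference between the curves in Figs. 8 and 9 is that the ratio $R_c/C$ is significantly smaller for the latter curve.

†See Sec. III-B for a discussion of this problem and of the procedure normally used to investigate such bounds.

Fig. 9. Error exponents for channel in Fig. 1 with $H(f) = [1 + jf]^{-1}$, N(f) = 1, and $S = 10^3$. (Horizontal axis: R, nats.)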

A fact of considerable practical importance can be determined from the above examples. At present, the highest rate at which practical coding devices can operate is $R_{comp}$ (Ref. 82), the rate axis intercept of the straight-line portion of the unexpurgated random coding bound.† For "noisy" channels [see Eq. (43)], $R_{comp}$ is readily found to be C/2, a somewhat disappointing result in view of a natural desire to signal at rates approaching C. However, Figs. 8 and 9 demonstrate that the situation is quite different for moderately large signal powers. For the

band-limited  channel,  Rcomp  is  essentially  1  bit/second  less  than  capacity  for  all  S  ;>  10  #  and 

in  fact  R  /C  -*  i  for  S  For  the  single-pole  channel,  it  is  readily  shown  that  R  — 

comp'  &  k  ^  J  comp 

0.8  C  for  S  -+  <*>  with  this  relation  being  quite  accurate  for  S  >  10  .  Thus,  at  high  signal  powers 
it  is  possible  to  achieve  data  rates  quite  close  to  capacity  using  existing  coding  and  decoding 
techniques. 
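The behavior of $R_{comp}$ relative to capacity for the two example channels can be tabulated from the closed forms given earlier, together with the relation $R_{comp}(S) = C(S/2)$. The sketch below is illustrative: the function names and the particular values of S are assumptions, and no claim is made about the signal power at which the limiting values become accurate.

```python
import math

def capacity_bandlimited(S):
    """C = R(0) for the band-limited example (nats/second)."""
    return math.log(1.0 + S / 2.0)

def capacity_single_pole(S):
    """C = R(0) for H(f) = 1/(1 + jf), N(f) = 1 (nats/second)."""
    w = (3.0 * S / 4.0) ** (1.0 / 3.0)
    return 2.0 * (w - math.atan(w))

def r_comp(capacity, S):
    """R_comp(S) = C(S/2): capacity evaluated at half the signal power."""
    return capacity(S / 2.0)

for S in (1e1, 1e3, 1e5, 1e7):
    gap_bits = (capacity_bandlimited(S) - r_comp(capacity_bandlimited, S)) / math.log(2.0)
    ratio = r_comp(capacity_single_pole, S) / capacity_single_pole(S)
    print(f"S = {S:.0e}: band-limited C - R_comp = {gap_bits:.3f} bits/second, "
          f"single-pole R_comp/C = {ratio:.3f}")
```

As S grows, the band-limited gap tends to 1 bit/second and the single-pole ratio tends to $2^{-1/3} \approx 0.79$, in line with the limiting statements in the text.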

In concluding this section, one final point should be made concerning the error bounds. As indicated above, practical coding devices operate at rates less than $R_{comp}$. However, the use of such devices would normally imply a desire to use the channel as efficiently as possible, i.e., to use a rate near $R_{comp}$. Thus, the fact that the unexpurgated random coding bound applies for all rates above $R_c = R_{comp} - (S/8)\, B(1)$, coupled with the fact that $R_{comp} > R_c$, implies that there would seldom be any practical interest in the expurgated bound. Furthermore, it is clear from Fig. 7 that $R_c$ and C are the crucial factors in determining the unexpurgated bound;‡ in other words, given $R_c$ and C, an exponent that is sufficiently accurate for engineering purposes can be obtained graphically by simply using a French curve to draw the exponent between $R_c$ and C.


†Note from Eqs. (38) and (39) that for any channel and for any signal power S, $R_{comp}(S) = C(S/2)$, i.e., the expression for $R_{comp}$ is identical to that for C with S replaced by S/2.

‡This, of course, assumes that C is known as a function of S so that $R_{comp}$ can be determined from the relation $R_{comp}(S) = C(S/2)$.


F. OPTIMUM SIGNAL DESIGN IMPLICATIONS OF CODING BOUNDS


This  section  is  concerned  with  presenting  an  answer  to  the  third  question  of  Sec.  I-E;  namely, 
how  should  the  signals  that  are  to  be  transmitted  through  the  channel  of  Fig.  1  be  constructed  so 
as  to  minimize  probability  of  error.  The  need  for  such  an  answer  is  made  clear  when  the  wide 
variety  of  techniques  (such  as  AM/DSB,  AM/VSB,  PM,  differential  PM,  FM,  etc.)  presently 
used  for  telephone  line  data  transmission  are  considered. 

Initially, it might appear somewhat surprising to expect any information concerning signal design from the error bounds derived above. For example, the derivation of error bounds for the white Gaussian channel provides no insight into "good" signaling waveforms and in fact it makes no difference: the signals are simply linear combinations of any set of orthonormal functions.† However, for the channel of Fig. 1, the previous analysis shows that "good" signals are finite linear combinations of a particular set of orthonormal functions, namely, those of Theorems 1 to 3. Furthermore, the use of any other set of orthonormal functions would, in general, lead to "good" signals that were infinite linear combinations of these functions. Thus, in this case, the error bounds do indeed provide significant insight into how "good" signals should be constructed.

To obtain an understanding of how the error bounds provide an answer to the signal design question, it is necessary to reconsider briefly the original formulation of the problem. As indicated in Sec. I-E, it was desired to formulate the problem of digital communication over fixed time-continuous channels with memory in such a way that the subsequent analysis would lead to a bound on the best possible performance. Thus, no practical restrictions were introduced with respect to the form of the signals; it was simply stated that the channel would be used once for T seconds to transmit a signal of T seconds duration. Since T was completely arbitrary, this appeared to be the most general statement possible.‡ Following this formulation of the problem, basis functions were found for use in the vector space representation of the signals. Since the basis functions were shown to be complete, this representation introduced no restrictions on the form of possible "good" signals. Next, the vector space signal representation was used with the random coding technique to obtain an upper bound to probability of error for a random ensemble of codes. As noted in Sec. III-A, this derivation initially restricted the signals to be d-dimensional where d was finite but could be taken to be arbitrarily large. However, the result of optimizing the random coding bound over the structure of the ensemble of codes showed that only a lesser number N of the input signal coordinates should be used. [See Eq. (31) and the preceding maximization over $\boldsymbol{\sigma}$.] Thus, allowing d to become infinite after the optimization procedure removes the initial finite dimensionality restriction. This leads to the conclusion that of all possible structures for a random ensemble of codes, the best is the one in which each code word in the ensemble is a finite linear combination of the first N eigenfunctions. Furthermore, this result demonstrates that "optimum" digital communication over the channel of Fig. 1 involves use of the channel only once to transmit one of these signals.


† Clearly, practical considerations (e.g., a peak-power limitation) might make one set of functions more desirable
than another.

‡ For example, this statement does not preclude the possibility that "good" signals might be linear combinations
of time translates of a relatively short and simple signal; it just does not assume this.

In conclusion, two points should be made concerning this "optimum" signal structure.

(1) In retrospect, it is almost obvious that good signals should be of this
form. As noted before, the eigenvalues $\lambda_i$ are effectively energy
transfer ratios. Furthermore, when the eigenvalues are ordered so that
$\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots$, it is known (Ref. 64) that
$\lambda_i \to 0$ for $i \to \infty$. Thus, since the
input signals have an energy constraint and since the channel noise is
Gaussian, it would intuitively seem most unwise to place a large amount
of signal energy on an eigenfunction that was severely "attenuated" in
passing through the filter. Clearly, the theoretical analysis confirms
this reasoning and, in addition, provides an explicit method for
determining the number of eigenfunctions N that should be used.

(2) Although the previous discussion has demonstrated the optimality of this
signal structure from a theoretical viewpoint, it is clear that from an
engineering standpoint it is not practical; i.e., the generation of a large
number of eigenfunctions lasting for days or weeks is simply not feasible.
Thus, it becomes important to investigate other more practical forms of
basis functions in an attempt to find signals that can be readily generated
in practice and are also nearly as "good" as the optimum signals. This
problem is considered in Chapter IV.

G.  DIMENSIONALITY  OF  COMMUNICATION  CHANNEL 


The  dimensionality  of  a  channel  is  a  concept  of  interest  to  communication  engineers.  This 
concept  is  frequently  discussed  in  terms  of  the  noiseless,  band-limited  channel  defined  by 


$$H(f) = \begin{cases} 1, & |f| \le W \\ 0, & \text{elsewhere} \end{cases} \qquad\qquad N(f) = 0$$


For this channel, the dimensionality is usually considered to be the number of linearly
independent signals that can be transmitted in T seconds and recovered without mutual interference. By
an argument based on the sampling representation for band-limited signals it is concluded that
the channel dimensionality is 2TW, since use of (sin t)/t signals allows transmission and
recovery of 2W independent signals per second. There are, however, several fundamental criticisms
of this approach.

(1) Since (sin t)/t signals are not time-limited, the statement that 2TW
signals can be transmitted in T seconds involves an (arbitrary)
approximation.

(2) It is not clear how this approach should be used to define the
dimensionality of a band-limited but nonrectangular channel or of a
non-band-limited channel. For example, if $H(f) = [1 + j\omega]^{-1}$, how is W to be
defined? Conversely, if the channel is band-limited but nonrectangular,
its impulse response is, in general,

$$h(t) = \sum_i h_i \, \frac{\sin \pi(2Wt - i)}{\pi(2Wt - i)} .$$

Thus, transmission of (sin t)/t signals leads to received signals having
mutual (or intersymbol) interference.

(3) In view of Theorem 1, an alternate and considerably more general
definition of channel dimensionality is simply the number of orthonormal
signals of T seconds duration that can be obtained which remain orthogonal
over some interval at the channel output. This definition is more
appealing, since it applies to any channel and since the intersymbol interference
is zero between all output signals. However, it was shown in Theorem 1
that the set of $\varphi_i(t)$ having this property is complete† and therefore
infinite in number. Thus, in contrast to the finite dimensionality of the
original approach, this definition indicates that all channels are infinite
dimensional. Clearly, this represents no improvement over the
previous definition.
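
Criticism (1) can be made quantitative with a short numerical sketch. The function below is a hypothetical helper (its name, the midpoint-rule integration, and the choice of pulse $\sin(2\pi Wt)/(2\pi Wt)$ centered in the interval are assumptions made for illustration, not part of the original analysis). It estimates what fraction of the energy of one such pulse falls inside a window of T seconds.

```python
import numpy as np

def sinc_energy_fraction(T, W, n=200000):
    """Fraction of the energy of sin(2*pi*W*t)/(2*pi*W*t) lying in |t| <= T/2.

    Illustrative helper (not from the report): midpoint-rule integration.
    """
    dt = T / n
    t = -T / 2 + (np.arange(n) + 0.5) * dt
    # np.sinc(x) = sin(pi*x)/(pi*x), so np.sinc(2*W*t) is the desired pulse.
    inside = np.sum(np.sinc(2.0 * W * t) ** 2) * dt
    total = 1.0 / (2.0 * W)  # energy of the pulse over the whole time axis
    return inside / total

if __name__ == "__main__":
    for T in (1.0, 10.0):
        print(T, sinc_energy_fraction(T, W=1.0))
```

Under these assumptions, a measurable part of the pulse energy lies outside the T-second window even when 2TW is fairly large, which is the approximation the criticism refers to.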


Before presenting a definition of dimensionality that overcomes these problems, it is
important to consider in greater detail the infinite dimensional result just obtained. As indicated
previously, the eigenvalue $\lambda_i$ corresponding to each $\varphi_i(t)$ is the energy "transfer ratio" of
the filter for that eigenfunction, i.e.,

$$\lambda_i = \frac{(h\varphi_i, h\varphi_i)_{T_1}}{(\varphi_i, \varphi_i)_T} = \frac{\int_0^{T_1} \left[ \int_0^T \varphi_i(\tau)\, h(t - \tau)\, d\tau \right]^2 dt}{\int_0^T \varphi_i^2(t)\, dt} .$$

Furthermore, when the $\lambda_i$ are ordered so that $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots$, it is known (Ref. 64) that $\lambda_i \to 0$ for
$i \to \infty$. Assuming normalized $\varphi_i(t)$, this implies that when $\varphi_i(t)$ is transmitted the output signal
energy approaches zero for large i. Thus, since any physical situation involves measurement
inaccuracies (noise), it is intuitively clear that the useful dimensionality of a channel is indeed
finite, i.e., all the $\varphi_i(t)$ whose eigenvalues are "too small" are unimportant in determining the
channel dimensionality.
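
The decay of the energy transfer ratios can be illustrated numerically. The sketch below is not taken from the report: it assumes the single-pole filter $h(t) = e^{-t}$, $t \ge 0$, an infinite observation interval ($T_1 = \infty$), and a simple midpoint (Nyström) discretization of the kernel $R_h(\tau) = \int h(t)h(t-\tau)\,dt = \tfrac{1}{2}e^{-|\tau|}$ on $[0, T]$. The function name is hypothetical.

```python
import numpy as np

def channel_eigenvalues(T, n=400):
    """Approximate eigenvalues of the kernel R_h(t - tau) on [0, T].

    Assumes h(t) = exp(-t) for t >= 0, so that R_h(tau) = 0.5*exp(-|tau|).
    The integral operator is discretized on n midpoints (Nystrom method).
    """
    dt = T / n
    t = (np.arange(n) + 0.5) * dt
    kernel = 0.5 * np.exp(-np.abs(t[:, None] - t[None, :]))
    lam = np.linalg.eigvalsh(kernel * dt)  # symmetric matrix, ascending order
    return lam[::-1]                       # largest eigenvalue first

if __name__ == "__main__":
    lam = channel_eigenvalues(T=10.0)
    print(lam[:5])    # leading energy transfer ratios, all below unity
    print(lam[50])    # a high-order eigenfunction is strongly attenuated
```

For this assumed filter the leading eigenvalues are close to, but below, unity, and the high-order ones are small, in line with the statement that $\lambda_i \to 0$.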

From this discussion it is clear that a useful definition of the dimensionality of a
communication channel must include a meaningful definition of "too small." The following definition of
dimensionality, which is based on the optimum signal results of Sec. III-F, satisfies this
requirement.

Let $\{x_j(t)\}$ be a set of signals for use with the channel of Fig. 1 and let the
$\{x_j(t)\}$ be selected in the optimum manner of Sec. III-F, i.e., each $x_j(t)$ is
of the form

$$x_j(t) = \sum_{i=1}^{N} x_{ij}\, \varphi_i(t), \qquad 0 \le t \le T$$

where N is determined by Eq. (31). Then the dimensionality D of the
channel is defined as‡

$$D \triangleq \lim_{T \to \infty,\ \rho \to 0} \frac{N}{T} .$$

Note that this is effectively a "dimensionality per second" definition as opposed to the
"dimensionality per T seconds" definition considered previously. This normalized definition is used
so that a finite number will be obtained for the dimensionality in the limit $T \to \infty$.

In  concluding  this  section,  several  points  concerning  this  definition  should  be  mentioned. 

(1)  From  Appendix  B  it  follows  that  D  =  2W,  where  W  is  defined  by  Eq.  (39) 
and  the  water  pouring  interpretation  of  Fig.  5(b). 


† Although the completeness proof for Theorem 1 does not apply to the rectangular band-limited channel
(.C, W0|d. = °o) it is possible to show that the $\varphi_i(t)$ are also complete for this case.

‡ Recall that N is the dimensionality (Sec. II-C) of the set of transmitted signals and that for $T \to \infty$ and $\rho \to 0$,
$R \to C$. Thus, the channel dimensionality is defined as the limiting (normalized) dimensionality of the set of
signals that achieve a rate arbitrarily close to capacity.




(2)  The  presence  of  noise  is  considered  simply  and  directly  in  the  evaluation 
of  W  and  thus  of  D. 

(3) It is satisfying to note that application of this definition to the
band-limited channel defined by

$$H(f) = \begin{cases} 1, & |f| \le W \\ 0, & \text{elsewhere} \end{cases} \qquad\qquad N(f) = N_0 > 0$$

gives D = 2W.

(4)  This  definition  applies  to  any  channel  whether  or  not  it  is  band-limited 
and  flat. 

(5) Calculations indicate that $N \approx TD = 2TW$ [where N is defined by Eq. (31)
and W is defined as in (1)] when T is only moderately large. For
example, Slepian7^ has shown that, for the band-limited channel defined
in (3), the error in this relation $(N - 2TW)$ is not greater than unity for
$2TW > 2$. Furthermore, at the other extreme, calculations by the author
for the channel

$$H(f) = (1 + jf)^{-1}, \qquad N(f) = N_0, \qquad S/2N_0 = 10^2$$

have shown that for T > 1 second the error is again not greater than
unity. Thus, this definition of dimensionality, although defined as a
limit for $T \to \infty$, gives meaningful information for practical values
of T.
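
The relation $N \approx 2TW$ for the flat band-limited channel can be explored with a rough numerical sketch. The code below is an illustration only: it discretizes the kernel $R_h(\tau) = \sin(2\pi W\tau)/(\pi\tau)$ on $[0, T]$ and counts the eigenvalues exceeding a fixed threshold. The threshold of 0.5 and the function name are assumptions made here; in the text, N is determined by Eq. (31), which depends on the noise level and is not reproduced in this sketch.

```python
import numpy as np

def count_significant_eigenvalues(T, W, threshold=0.5, n=400):
    """Count eigenvalues of the time- and band-limiting kernel above a threshold.

    Illustrative assumption: 'significant' means lambda > threshold; this
    stands in for, and is not the same as, the rule of Eq. (31).
    """
    dt = T / n
    t = (np.arange(n) + 0.5) * dt
    tau = t[:, None] - t[None, :]
    # 2W*sinc(2W*tau) = sin(2*pi*W*tau)/(pi*tau), the autocorrelation of an
    # ideal low-pass filter of bandwidth W.
    kernel = 2.0 * W * np.sinc(2.0 * W * tau)
    lam = np.linalg.eigvalsh(kernel * dt)
    return int(np.sum(lam > threshold))

if __name__ == "__main__":
    for T, W in ((10.0, 1.0), (4.0, 1.0)):
        print(T, W, count_significant_eigenvalues(T, W), 2 * T * W)
```

With this assumed threshold the count stays within about one of 2TW, which is the kind of agreement described in point (5).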

CHAPTER IV

STUDY  OF  SUBOPTIMUM  MODULATION  TECHNIQUES 


It  has  been  shown  that  "optimum"  digital  communication  over  the  channel  of  Fig.  1  involves 
use  of  the  channel  once  for  T  seconds  to  transmit  one  of  M  signals.  It  has  been  demonstrated 
that "good" signals should be constructed as finite linear combinations of the first N
eigenfunctions, i.e., each $x_j(t)$ is of the form

$$x_j(t) = \sum_{i=1}^{N} x_{ij}\, \varphi_i(t), \qquad 0 \le t \le T \qquad (60)$$

and that the optimum detector makes a decision based on the N numbers $y_i$, where†

$$y_i = (y, \theta_i)_\infty . \qquad (61)$$

It should be recalled, however, that there has been no claim that this approach is in any sense
practical. In fact, since T might be on the order of hours, days, or weeks and since the
generation of a large number of different functions [the $\{\varphi_i(t)\}$] of this duration is not feasible, it is
clear that it is not. The purpose of this section is to investigate some "suitable substitutes" for
the $\{\varphi_i(t)\}$ and to compare the resulting error exponents to that obtained when the $\{\varphi_i(t)\}$ are used.
In  this  manner  it  will  be  possible  to  make  an  engineering  evaluation  of  the  trade-off  between 
equipment  complexity  and  performance. 

In  the  "optimum"  approach  to  digital  communication  over  the  channel  of  Fig.  1  there  are  two 
basic  operations: 

The selection and generation of the transmitted signal $x_j(t)$;

The  receiver  decision  based  on  the  channel  output  y(t). 

However, when considering an implementation of this approach it is convenient to break the
problem into two different classifications, commonly called coding and modulation (Ref. 81):

Under modulation is included the problem of generating the
$\{\varphi_i(t)\}$, or suitable substitutes, and the problem of determining
the $y_i$ of Eq. (61).

Under coding is included the problem of selecting the coefficients
$x_{ij}$ and the problem of making a decision based on the numbers $y_i$.
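
The division of labor between coding and modulation can be sketched in a few lines. Everything in the example below is an illustrative assumption rather than the system analyzed in the report: the basis is a hypothetical orthonormal set of sinusoids on $[0, T]$, the channel is taken to be ideal (no filter, no noise), and the decoder simply selects the nearest code word in the coefficient space.

```python
import numpy as np

def make_basis(n_funcs, n_samples, T):
    """Hypothetical orthonormal basis sqrt(2/T)*sin(i*pi*t/T), sampled at midpoints."""
    dt = T / n_samples
    t = (np.arange(n_samples) + 0.5) * dt
    i = np.arange(1, n_funcs + 1)[:, None]
    return np.sqrt(2.0 / T) * np.sin(i * np.pi * t / T), dt

def modulate(coeffs, basis):
    """Modulation, transmit side: x(t) = sum_i x_i * phi_i(t)."""
    return coeffs @ basis

def demodulate(waveform, basis, dt):
    """Modulation, receive side: y_i = inner product of waveform and phi_i."""
    return basis @ waveform * dt

def decode(y, codebook):
    """Coding, receive side: index of the code word closest to y."""
    return int(np.argmin(np.sum((codebook - y) ** 2, axis=1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    basis, dt = make_basis(n_funcs=4, n_samples=256, T=1.0)
    codebook = rng.standard_normal((8, 4))   # coding, transmit side: M = 8 code words
    x = modulate(codebook[5], basis)
    print(decode(demodulate(x, basis, dt), codebook))
```

The coder and decoder see only the numbers $x_{ij}$ and $y_i$; the waveforms appear only inside the two modulation routines.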

Although this breakdown is convenient and widely used in practice, it must be emphasized that
the basic operations are the two indicated previously. Thus, in the design of an efficient
communication system, coding and modulation must be considered together and possible trade-offs
between the two evaluated. In practice, this might be accomplished by evaluating the cost and
performance of various combinations of several coding and modulation techniques.

In view of the existence of several practical coding techniques (Refs. 82-84) and the lack of
significant previous studies of modulation for the channel of Fig. 1, the remainder of this report is

† For simplicity of presentation, the following discussion assumes white noise and an infinite observation interval.
However, unless otherwise indicated, the discussion is valid, with obvious modifications, for any noise spectrum
and any observation interval by simply introducing the appropriate $\{\varphi_i(t)\}$ and inner product for $y_i$ from Theorems
1 to 3.

concerned with a theoretical investigation of several modulation techniques and an experimental
investigation  of  one  that  appears  quite  promising  for  the  telephone  channel. 

A basic problem in designing a modulation system is the determination of "suitable
substitutes" for the $\{\varphi_i(t)\}$ of Eq. (60); it simply is not practical to think of constructing a large number
of signals that may last for hours, days, or weeks. Thus, in practice it is necessary to replace
the $\{\varphi_i(t)\}$ by a set of functions that are time translates of a small set of relatively simple signals.
When this is done the phenomenon called intersymbol interference appears due to the
time-dispersive nature of the channel. As a result, the modulation problem reduces to a study of
various techniques for overcoming intersymbol interference. The following sections consider
several such techniques.

A.  SIGNAL  DESIGN  TO  ELIMINATE  INTERSYMBOL  INTERFERENCE 

This section considers the possibility of substituting for the $\{\varphi_i(t)\}$ a set of time translates
of a single, time-limited signal that has been designed to eliminate intersymbol interference.
Although this problem can be formulated and a formal solution presented for an arbitrary
observation interval, it is practical to obtain numerical results only for the infinite interval. In
view of this and the fact that the resulting analysis is greatly simplified, an infinite observation
interval ($T_1 = \infty$) is assumed at the outset. For simplicity, it is also assumed that the noise is
white; the generalization to colored noise is presented in Appendix F.

Consider the situation in which a time-limited signal x(t) of $\mathcal{T}$ seconds duration is transmitted
through the channel of Fig. 1.† Assume that the channel is followed by a matched filter whose
output is sampled at $t = k\mathcal{T}$, k = 0, ±1, ±2, .... Then by designing x(t) so that the matched
filter output is zero for all sampling instants except t = 0, it will be possible to transmit a
sequence of time translates (by $k\mathcal{T}$ seconds) of x(t) without incurring intersymbol interference.

It  is  not  clear,  however,  that  this  approach  will  lead  to  performance  that  is  acceptable  relative 
to  the  optimum  performance  found  previously.  This  question  can  be  investigated  by  considering 
first  the  problem  of  choosing  a  fixed  energy  x(t)  to  eliminate  intersymbol  interference  and  in 
addition  give  maximum  energy  at  the  channel  output.  It  will  then  be  possible  to  compare  the 
error  exponent  for  the  best  possible  performance  of  this  suboptimum  technique  with  the  optimum 
exponent  found  previously. 

The  problem  of  choosing  x(t)  to  satisfy  the  conditions  indicated  above  can  be  solved  using 
standard  techniques  of  the  calculus  of  variations.*^*'88  From  Fig.  1  and  the  definition  of  a  matched 
filter  it  follows  that  when  x(t)  is  transmitted,  the  matched  filter  output  for  t  =  kj  is 


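$$\int_{-\infty}^{\infty} \left[ \int_0^{\mathcal{T}} x(\sigma)\, h(t - \sigma)\, d\sigma \right] \left[ \int_0^{\mathcal{T}} x(\rho)\, h(t - k\mathcal{T} - \rho)\, d\rho \right] dt = \int_0^{\mathcal{T}} \int_0^{\mathcal{T}} x(\sigma)\, x(\rho)\, R_h(\sigma - \rho - k\mathcal{T})\, d\rho\, d\sigma$$

where

$$R_h(\tau) \triangleq \int_{-\infty}^{\infty} h(t)\, h(t - \tau)\, dt$$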

† For this problem, $\mathcal{T}$ will be on the order of the reciprocal of the channel bandwidth.

and that the energy of x(t) at the channel output is simply the matched filter output for k = 0.
Finally, the energy of x(t) is given by

$$\int_0^{\mathcal{T}} x^2(t)\, dt .$$
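
The sampled matched filter output is easy to evaluate numerically for a trial pulse. The sketch below is an illustration under stated assumptions: the channel is taken to be the single-pole filter $h(t) = e^{-t}$, $t \ge 0$, for which $R_h(\tau) = \tfrac{1}{2}e^{-|\tau|}$; the pulse is a unit-amplitude rectangle of duration $\mathcal{T} = 1$; and the double integral is approximated by a midpoint rule. The function names are hypothetical.

```python
import numpy as np

def autocorr_single_pole(tau):
    """R_h(tau) for the assumed filter h(t) = exp(-t), t >= 0."""
    return 0.5 * np.exp(-np.abs(tau))

def matched_filter_sample(x, k, T, autocorr):
    """Approximate the double integral of x(s)*x(r)*R_h(s - r - k*T) over [0,T]^2.

    x holds samples of the pulse at the n midpoints of [0, T].
    """
    n = len(x)
    dt = T / n
    t = (np.arange(n) + 0.5) * dt
    lag = t[:, None] - t[None, :] - k * T
    return float(x @ autocorr(lag) @ x) * dt * dt

if __name__ == "__main__":
    x = np.ones(400)  # rectangular pulse, an arbitrary trial choice
    for k in range(3):
        print(k, matched_filter_sample(x, k, 1.0, autocorr_single_pole))
```

For this trial pulse the sample at k = 1 is more than half of the sample at k = 0, so a pulse chosen without regard to the channel produces substantial intersymbol interference.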


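Thus, by means of Lagrange multipliers, the constrained maximization problem requires
maximization of the functional

$$I = \int_0^{\mathcal{T}} \left\{ -\lambda\, x^2(\sigma) + \int_0^{\mathcal{T}} x(\sigma)\, x(\rho)\, R_h(\sigma - \rho)\, d\rho + \sum_{k=1}^{N} \beta_k \int_0^{\mathcal{T}} x(\sigma)\, x(\rho)\, R_h(\sigma - \rho - k\mathcal{T})\, d\rho \right\} d\sigma \qquad (62)$$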


where  the  intersymbol  interference  constraints  have  been  applied  only  to  sampling  times  greater 
than  zero  since  the  output  of  a  matched  filter  is  an  even  function  of  time  about  t  =  0.  The  number 
of  successive  sampling  times  N  to  which  the  constraints  are  applied  is  arbitrary  at  this  point. 

The maximization of I is accomplished in the usual manner by substituting $x(t) = y(t) + \epsilon f(t)$,
where y(t) is the desired solution, and setting

$$\left. \frac{dI}{d\epsilon} \right|_{\epsilon = 0} = 0 .$$

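Appendix C shows that this implies

$$\lambda\, y(t) = \int_0^{\mathcal{T}} y(\tau)\, K(t - \tau)\, d\tau, \qquad 0 \le t \le \mathcal{T}$$

where

$$K(t - s) \triangleq R_h(t - s) + \sum_{k=1}^{N} \beta_k \left[ R_h(t - s + k\mathcal{T}) + R_h(t - s - k\mathcal{T}) \right] \qquad (63)$$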


Since K(t − s) is symmetric and $\mathcal{L}_2$,† it is known that a solution to this equation exists.
However, to the author's knowledge, there are no results in the theory of integral equations which
ensure that the $\{\beta_k\}$ can be selected to satisfy the intersymbol interference constraints. Under
the  restriction  that  h(t)  is  the  impulse  response  of  a  lumped  parameter  system,  this  difficulty 
can  be  overcome  by  transforming  Eq.  (63)  into  a  differential  equation  with  boundary  conditions. 
(See  Tricomi^5  for  a  discussion  of  the  relation  between  boundary  value  differential  equations  and 
integral equations.) The boundary conditions will be determined first by investigating the
properties that any signal must have to give zero intersymbol interference at the matched filter
output.


† When all the $\beta_k$ are finite, this follows directly from Appendix A.




By  definition,  the  impulse  response  of  a  lumped  parameter  system  can  be  written  as 


$$h(t) = \begin{cases} \displaystyle\sum_{i=1}^{n} a_i\, e^{-s_i t}, & t \ge 0 \\ 0, & t < 0 \end{cases} \qquad (64)$$


where $a_i$ and $s_i$ are, in general, complex constants satisfying certain conditions which ensure
that h(t) is real. Thus, for $t > \mathcal{T}$, the response r(t) to an input x(t) that is nonzero only on the
interval $[0, \mathcal{T}]$ is

$$r(t) = \int_0^{\mathcal{T}} x(\sigma)\, h(t - \sigma)\, d\sigma = \sum_{i=1}^{n} a_i\, X(-s_i)\, e^{-s_i t}$$

where

$$X(-s_i) \triangleq \int_0^{\mathcal{T}} x(\sigma)\, e^{s_i \sigma}\, d\sigma .$$


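Similarly, the matched filter output, say z(t), for $t \ge \mathcal{T}$ is given by

$$z(t) = \int_0^{\infty} r(\sigma)\, r(t + \sigma)\, d\sigma = \sum_{i=1}^{n} a_i\, X(-s_i)\, e^{-s_i t} \int_0^{\infty} r(\sigma)\, e^{-s_i \sigma}\, d\sigma = \sum_{i=1}^{n} a_i\, X_i\, H(s_i)\, e^{-s_i t}$$

where

$$H(s) \triangleq \int_0^{\infty} h(t)\, e^{-st}\, dt$$

and

$$X_i \triangleq X(s_i)\, X(-s_i) .$$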
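
The closed form for the filter response after the end of the input pulse can be checked directly. The sketch below is a numerical illustration, not part of the original derivation: it assumes a rectangular input pulse on $[0, \mathcal{T}]$, for which $X(-s_i) = (e^{s_i \mathcal{T}} - 1)/s_i$, and an arbitrary two-pole lumped-parameter filter with real $a_i$ and $s_i$. The function names and parameter values are hypothetical.

```python
import numpy as np

def response_numeric(t0, a, s, T=1.0, n=20000):
    """r(t0) by midpoint integration of x(sigma)*h(t0 - sigma), x = 1 on [0, T].

    Valid for t0 > T, where h(t0 - sigma) is given by the exponential sum.
    """
    dsig = T / n
    sigma = (np.arange(n) + 0.5) * dsig
    h = sum(ai * np.exp(-si * (t0 - sigma)) for ai, si in zip(a, s))
    return float(np.sum(h) * dsig)

def response_formula(t0, a, s, T=1.0):
    """r(t0) = sum_i a_i * X(-s_i) * exp(-s_i * t0) for the rectangular pulse."""
    return float(sum(ai * (np.exp(si * T) - 1.0) / si * np.exp(-si * t0)
                     for ai, si in zip(a, s)))

if __name__ == "__main__":
    a, s = [1.0, -0.5], [1.0, 3.0]   # assumed residues and pole parameters
    print(response_numeric(1.7, a, s), response_formula(1.7, a, s))
```

The two values agree to within the integration error, which illustrates why the output for $t > \mathcal{T}$ depends on the input only through the n numbers $X(-s_i)$.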

Now, assume that intersymbol interference is zero for n successive sampling instants, i.e.,
$z(k\mathcal{T}) = 0$ for k = 1, 2, ..., n.† Then the $\{X_i\}$ must satisfy the n homogeneous equations

$$\sum_{i=1}^{n} a_i\, X_i\, H(s_i)\, e^{-s_i k \mathcal{T}} = 0, \qquad k = 1, 2, \ldots, n .$$


† In order that the error exponent may be evaluated for these signals it is necessary that intersymbol interference
be zero for all sampling instants, i.e., that $N = \infty$ in Eq. (62). However, the results of the following derivation
are independent of N, provided $N \ge n$, and as demonstrated later, lead to signals giving zero intersymbol
interference at all sampling instants. Thus, for convenience, N = n is assumed at this point.
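
However, these equations can be satisfied by nonzero $X_i$ if, and only if, the determinant of the
coefficients of the $\{X_i\}$ vanishes. By forming this determinant and factoring out common terms,
it reduces to

$$\begin{vmatrix}
1 & \cdots & 1 & \cdots & 1 \\
e^{-s_1 \mathcal{T}} & \cdots & e^{-s_k \mathcal{T}} & \cdots & e^{-s_n \mathcal{T}} \\
(e^{-s_1 \mathcal{T}})^2 & \cdots & (e^{-s_k \mathcal{T}})^2 & \cdots & (e^{-s_n \mathcal{T}})^2 \\
\vdots & & \vdots & & \vdots \\
(e^{-s_1 \mathcal{T}})^{n-1} & \cdots & (e^{-s_k \mathcal{T}})^{n-1} & \cdots & (e^{-s_n \mathcal{T}})^{n-1}
\end{vmatrix}$$

This is a Vandermonde determinant and is known (Ref. 86) to be equal to

$$\prod_{\substack{i = 1, \ldots, n \\ k = 1, \ldots, i-1}} \left( e^{-s_i \mathcal{T}} - e^{-s_k \mathcal{T}} \right) .$$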

Thus, the determinant will be zero if, and only if, for some $i \ne k$

$$s_i = s_k + j\, \frac{2\pi l}{\mathcal{T}} \qquad (65)$$

where

$$j = \sqrt{-1}, \qquad l = 0, \pm 1, \pm 2, \ldots$$
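
The product form of the Vandermonde determinant, and the way it vanishes when two poles satisfy Eq. (65), can be confirmed numerically. The following sketch uses arbitrary illustrative pole values; the function names are hypothetical.

```python
import numpy as np

def vandermonde_det(s, T):
    """Determinant of the matrix whose row m holds (exp(-s_k*T))**m, m = 0..n-1."""
    u = np.exp(-np.asarray(s, dtype=complex) * T)
    V = np.vander(u, N=len(u), increasing=True).T
    return np.linalg.det(V)

def vandermonde_product(s, T):
    """Product over k < i of (exp(-s_i*T) - exp(-s_k*T))."""
    u = np.exp(-np.asarray(s, dtype=complex) * T)
    prod = 1.0 + 0.0j
    for i in range(len(u)):
        for k in range(i):
            prod *= u[i] - u[k]
    return prod

if __name__ == "__main__":
    T = 1.0
    generic = [1.0, 2.0, 0.5 + 1.0j]
    print(vandermonde_det(generic, T), vandermonde_product(generic, T))
    # Two poles differing by j*2*pi/T: the determinant vanishes.
    special = [1.0, 1.0 + 2j * np.pi / T, 3.0]
    print(abs(vandermonde_det(special, T)))
```

For generic poles the determinant is nonzero, so the homogeneous equations admit only the trivial solution, which is the step the argument relies on.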

However, this relation between the poles of H(s) will not exist in general.† Thus, the $\{X_i\}$ must
be identically zero if there is to be zero intersymbol interference at the matched filter output.
Since $X_i = X(s_i)\, X(-s_i)$, it follows that for each i = 1, 2, ..., n either $X(s_i) = 0$ or $X(-s_i) = 0$. [The
condition $X(s_i) = 0$ and $X(-s_i) = 0$ is not included in the following discussion since it is a sufficient
but not necessary condition for obtaining zero intersymbol interference.] Although not obvious,
this restriction on x(t) is precisely what is required to obtain a solution to Eq. (63).

Recall that x(t) is defined to be nonzero only on the interval $[0, \mathcal{T}]$. Therefore, X(s) is an
entire function which, from the above arguments, has zeros at either $s_i$ or $-s_i$, i = 1, ..., n.
Gerst and Diamond (Ref. 87) have shown that a Laplace transform with these properties can be written

$$X(s) = Q(s)\, V(s) \qquad (66)$$

† The remainder of this discussion ignores the special cases in which Eq. (65) is satisfied since these are of limited
practical interest.

where

$$Q(s) \triangleq \prod_{i=1}^{n} (s - z_i) \qquad (67)$$

and $z_i = s_i$ or $-s_i$ according to whether $X(s_i) = 0$ or $X(-s_i) = 0$, respectively. V(s) is the Laplace
transform of a pulse v(t) that has a continuous (n − 1)st derivative and satisfies the boundary
conditions

$$\begin{aligned} v(0) &= v(\mathcal{T}) = 0 \\ &\ \,\vdots \\ v^{(n-1)}(0) &= v^{(n-1)}(\mathcal{T}) = 0 \end{aligned} \qquad (68)$$

where

$$v^{(i)}(t) \triangleq \frac{d^i v(t)}{dt^i} .$$

Thus, the result is obtained that any solution to Eq. (63) which satisfies the intersymbol
interference constraints must be of the form

$$y(t) = Q_t(p)\, v(t)$$

where

$$Q_t(p) \triangleq Q(s) \Big|_{s = p = d/dt}$$

and v(t) satisfies the integral equation

$$\lambda\, Q_t(p)\, v(t) = \int_0^{\mathcal{T}} \left[ Q_\sigma(p)\, v(\sigma) \right] K(t - \sigma)\, d\sigma, \qquad 0 \le t \le \mathcal{T} \qquad (69)$$


as  well  as  the  boundary  conditions  of  Eq.  (68).  This  result  provides  the  boundary  conditions 
required  to  obtain  a  solution  to  Eq.  (63).  The  corresponding  differential  equation  will  be  derived 
next. 

Observe that the operator $Q_\sigma(p)$ under the integral sign in Eq. (69) involves differentiation
with respect to $\sigma$. Thus, as shown in Appendix D, integration by parts yields, after substituting
the boundary conditions of Eq. (68),†

$$\lambda\, Q_t(p)\, v(t) = \int_0^{\mathcal{T}} v(\sigma) \left[ Q_\sigma(-p)\, K(t - \sigma) \right] d\sigma \qquad (70)$$

or

$$\lambda\, Q_t(p)\, v(t) = Q_t(p) \int_0^{\mathcal{T}} v(\sigma)\, K(t - \sigma)\, d\sigma \qquad (71)$$
† The fact that $K(t - \sigma)$ is a linear combination of translates of a lumped parameter autocorrelation function
implies that K(t) has derivatives of all orders when impulses and their derivatives are allowed.

where the last step follows from the fact that

$$\frac{d^i}{dt^i} K(t - \sigma) = (-1)^i\, \frac{d^i}{d\sigma^i} K(t - \sigma) .$$

Next, observe from Eqs. (64) and (67) that the definition of Q(s) implies that

$$H(s)\, H(-s) \triangleq \frac{N(s^2)}{D(s^2)} = \frac{c\, N(s^2)}{Q(s)\, Q(-s)}$$

where $N(s^2)$ and $D(s^2)$ are polynomials in $s^2$ and c is a constant. Thus, applying the differential
operator $Q_t(-p)$ to both sides of Eq. (71) gives

$$\lambda\, D(p^2)\, v(t) = D(p^2) \int_0^{\mathcal{T}} v(\sigma)\, K(t - \sigma)\, d\sigma \qquad (72)$$

where

$$D(p^2) \triangleq D(s^2) \Big|_{s = p = d/dt} .$$


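Since, by definition,

$$K(t - \sigma) = \int_{-\infty}^{\infty} \frac{N(s^2)}{D(s^2)} \exp[s(t - \sigma)] \left[ 1 + \sum_{k=1}^{n} \beta_k \left( e^{sk\mathcal{T}} + e^{-sk\mathcal{T}} \right) \right] df, \qquad s = j2\pi f$$

Eq. (72) implies that

$$\begin{aligned}
\lambda\, D(p^2)\, v(t) &= D(p^2) \int_0^{\mathcal{T}} v(\sigma) \int_{-\infty}^{\infty} \frac{N(s^2)}{D(s^2)} \exp[s(t - \sigma)] \left[ 1 + \sum_{k=1}^{n} \beta_k \left( e^{sk\mathcal{T}} + e^{-sk\mathcal{T}} \right) \right] df\, d\sigma \\
&= \int_0^{\mathcal{T}} v(\sigma) \int_{-\infty}^{\infty} N(s^2) \exp[s(t - \sigma)] \left[ 1 + \sum_{k=1}^{n} \beta_k \left( e^{sk\mathcal{T}} + e^{-sk\mathcal{T}} \right) \right] df\, d\sigma \\
&= N(p^2) \int_0^{\mathcal{T}} v(\sigma) \int_{-\infty}^{\infty} \exp[s(t - \sigma)] \left[ 1 + \sum_{k=1}^{n} \beta_k \left( e^{sk\mathcal{T}} + e^{-sk\mathcal{T}} \right) \right] df\, d\sigma \\
&= N(p^2) \int_0^{\mathcal{T}} v(\sigma) \left\{ \delta(t - \sigma) + \sum_{k=1}^{n} \beta_k \left[ \delta(t - \sigma + k\mathcal{T}) + \delta(t - \sigma - k\mathcal{T}) \right] \right\} d\sigma \\
&= N(p^2)\, v(t), \qquad 0 < t < \mathcal{T} .
\end{aligned} \qquad (73)$$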

This differential equation together with the boundary conditions of Eq. (68) and the relation

$$y(t) = Q_t(p)\, v(t) \qquad (74)$$

defines the solution to Eq. (63) which satisfies the intersymbol interference constraints. However,
since this is a boundary value instead of an initial value differential equation, there is no
assurance from the previous work (Ref. 88) that a solution exists. Brauer (Ref. 89) recently studied this problem and
found that there exists a countably infinite discrete set of eigenvalues $\{\lambda_i\}$ for which there are
corresponding eigenfunctions $\{v_i(t)\}$ that satisfy both the differential equation and the boundary
conditions. Thus, a solution to the integral equation of Eq. (63) which satisfies the intersymbol
interference constraints exists and satisfies Eqs. (68), (73), and (74).

Before proceeding with an investigation of the properties of the $\{v_i(t)\}$ and the corresponding
$\{y_i(t)\}$, two points should be noted. First, observe that the only property of the polynomial Q(s)
used in deriving Eq. (73) from Eq. (71) was

$$Q(s)\, Q(-s) = c\, D(s^2) . \qquad (75)$$

Therefore, it is possible to arbitrarily choose either $z_i = s_i$ or $z_i = -s_i$, for i = 1, ..., n in Eq. (67)
without changing the fact that the resulting $\{y_i(t)\}$ are solutions of Eq. (63). Thus, there are $2^n$
possible choices for Q(s), each of which leads to an equally valid solution to the original maximization
problem. Second, observe that Brauer's result states that there are an infinite number of
solutions to Eqs. (68) and (73). Thus, the question arises as to which of these solutions is the one
that yields maximum output signal energy. It is shown below that if the eigenvalues are ordered
so that $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots$, then $y_1(t) = Q_t(p)\, v_1(t)$ is the desired optimum signal.
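
That every sign choice for the $z_i$ leaves the product $Q(s)Q(-s)$ unchanged can be checked mechanically. The sketch below is an illustration only; it assumes three real pole parameters chosen arbitrarily, and the function names are hypothetical. It enumerates all $2^n$ choices and compares the resulting polynomial coefficients.

```python
import itertools
import numpy as np

def q_times_q_reflected(zeros):
    """Coefficients (highest power first) of Q(s)*Q(-s), Q(s) = prod (s - z_i)."""
    q = np.poly(zeros)
    n = len(q) - 1
    # The coefficient at index j multiplies s**(n - j); replacing s by -s
    # changes its sign when (n - j) is odd.
    q_reflected = q * (-1.0) ** (n - np.arange(n + 1))
    return np.polymul(q, q_reflected)

def all_sign_choices(poles):
    """Q(s)*Q(-s) for each of the 2**n choices z_i = +s_i or z_i = -s_i."""
    results = []
    for signs in itertools.product((+1.0, -1.0), repeat=len(poles)):
        zeros = [sg * p for sg, p in zip(signs, poles)]
        results.append(q_times_q_reflected(zeros))
    return results

if __name__ == "__main__":
    products = all_sign_choices([1.0, 2.0, 3.0])
    print(len(products))   # number of distinct sign choices
    print(products[0])     # the common polynomial Q(s)*Q(-s)
```

Since $(s - z_i)(-s - z_i) = z_i^2 - s^2$ depends only on $z_i^2$, all of the products coincide.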

The fact that the $\{v_i(t)\}$ satisfy Eqs. (68) and (73) leads to several interesting and useful
properties of the corresponding channel input signals $\{y_i(t)\}$. One property of the $\{y_i(t)\}$ is that
they are orthogonal and may be assumed normalized. To verify this, let the following notation
be adopted:

$$(f, g)_{\mathcal{T}} \triangleq \int_0^{\mathcal{T}} f(t)\, g(t)\, dt \qquad (76)$$

and

$$Pf(t) \triangleq P(p)\, f(t)$$

where $P(p) = P(s)|_{s = p = d/dt}$ and P(s) is a polynomial. Then it follows that

$$(y_i, y_j)_{\mathcal{T}} = (Qv_i, Qv_j)_{\mathcal{T}} .$$

Upon integrating by parts and substituting the boundary conditions of Eq. (68) this becomes†

$$(y_i, y_j)_{\mathcal{T}} = (v_i, \tilde{Q} Q v_j)_{\mathcal{T}} = c\, (v_i, D v_j)_{\mathcal{T}} \qquad (77)$$

where

$$\tilde{Q}(s) \triangleq Q(-s)$$

and the last step follows from Eq. (75). Now, from Eq. (73) it is found that

$$\lambda_i\, (v_j, D v_i)_{\mathcal{T}} = (v_j, N v_i)_{\mathcal{T}} \qquad (78)$$

and also that

$$\lambda_j\, (v_i, D v_j)_{\mathcal{T}} = (v_i, N v_j)_{\mathcal{T}} . \qquad (79)$$

However, integration by parts and substitution of the boundary conditions of Eq. (68) shows that‡

† See Appendix D for the details of a similar integration by parts.
‡ Operators satisfying the following relations are said to be self-adjoint.

$$(v_j, D v_i)_{\mathcal{T}} = (v_i, D v_j)_{\mathcal{T}}$$

and

$$(v_j, N v_i)_{\mathcal{T}} = (v_i, N v_j)_{\mathcal{T}} .$$

Thus, it follows from Eqs. (78) and (79) that

$$(\lambda_i - \lambda_j)\, (v_i, D v_j)_{\mathcal{T}} = 0 \qquad (80)$$

and therefore that $(v_i, D v_j)_{\mathcal{T}} = 0$ if $(\lambda_i - \lambda_j) \ne 0$. From Eq. (77) it then follows that all $y_i(t)$
corresponding to nondegenerate $v_i(t)$ are orthogonal.† Finally, since Eq. (73) is linear, it follows
that the $\{y_i(t)\}$ may be normalized; a convenient normalization, which is assumed throughout the
remainder of this work, redefines the $\{y_i(t)\}$ as

$$y_i(t) = \frac{1}{\sqrt{c}}\, Q_t(p)\, v_i(t) \qquad (81)$$

and assumes that

$$(y_i, y_i)_{\mathcal{T}} = 1 . \qquad (82)$$

A second property of the $\{y_i(t)\}$ is that they are "doubly orthogonal" after passing through
the channel filter in the sense that if $r_{ij}(t)$ is the filter output when $y_i(t - j\mathcal{T})$ is transmitted then

$$(r_{ij}, r_{kl})_\infty = \begin{cases} \lambda_i, & \text{if } i = k \text{ and } j = l \\ 0, & \text{otherwise} \end{cases} \qquad (83)$$

In other words, the $\{y_i(t)\}$ are orthogonal at the filter output and, in addition, nonequal time
translates of any two of the functions are orthogonal at the filter output.‡ This result can be verified
by noting from Eqs. (76) and (81) that

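$$\begin{aligned}
(r_{ij}, r_{kl})_\infty &= \frac{1}{c} \int_{-\infty}^{\infty} \int_0^{\mathcal{T}} \left[ Q_\sigma(p)\, v_i(\sigma) \right] h(t - j\mathcal{T} - \sigma)\, d\sigma \int_0^{\mathcal{T}} \left[ Q_\rho(p)\, v_k(\rho) \right] h(t - l\mathcal{T} - \rho)\, d\rho\, dt \\
&= \frac{1}{c} \int_0^{\mathcal{T}} \int_0^{\mathcal{T}} \left[ Q_\sigma(p)\, v_i(\sigma) \right] \left[ Q_\rho(p)\, v_k(\rho) \right] R_h[\sigma - \rho + (j - l)\mathcal{T}]\, d\sigma\, d\rho
\end{aligned}$$

Integration by parts and substitution of the boundary conditions of Eq. (68) show that this becomes

$$(r_{ij}, r_{kl})_\infty = \frac{1}{c} \int_0^{\mathcal{T}} \int_0^{\mathcal{T}} v_i(\sigma)\, v_k(\rho) \left\{ Q_\sigma(-p)\, Q_\sigma(p)\, R_h[\sigma - \rho + (j - l)\mathcal{T}] \right\} d\sigma\, d\rho$$

or, from Eq. (75),

$$(r_{ij}, r_{kl})_\infty = \int_0^{\mathcal{T}} \int_0^{\mathcal{T}} v_i(\sigma)\, v_k(\rho) \left\{ D(p^2)\, R_h[\sigma - \rho + (j - l)\mathcal{T}] \right\} d\sigma\, d\rho$$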


† It can be shown that degenerate eigenfunctions are of a finite multiplicity and may be assumed to be
orthogonal.

‡ It should be observed that the orthogonality of time translates of the $\{y_i(t)\}$ follows directly from the fact that
the $\{y_i(t)\}$ have been chosen to give zero intersymbol interference. In other words, the conditions of
orthogonality at the channel filter output and zero intersymbol interference at the matched filter output are identical.
Thus, the $\{y_i(t)\}$ give zero intersymbol interference at all sampling instants.
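
However, since

$$R_h(\sigma - \rho) = \int_{-\infty}^{\infty} \frac{N(s^2)}{D(s^2)} \exp[s(\sigma - \rho)]\, df, \qquad s = j2\pi f$$

this implies that

$$\begin{aligned}
(r_{ij}, r_{kl})_\infty &= \int_0^{\mathcal{T}} \int_0^{\mathcal{T}} v_i(\sigma)\, v_k(\rho) \left\{ D(p^2) \int_{-\infty}^{\infty} \frac{N(s^2)}{D(s^2)} \exp\{ s[\sigma - \rho + (j - l)\mathcal{T}] \}\, df \right\} d\sigma\, d\rho \\
&= \int_0^{\mathcal{T}} \int_0^{\mathcal{T}} v_i(\sigma)\, v_k(\rho) \left\{ N(p^2) \int_{-\infty}^{\infty} \exp\{ s[\sigma - \rho + (j - l)\mathcal{T}] \}\, df \right\} d\sigma\, d\rho \\
&= \int_0^{\mathcal{T}} \int_0^{\mathcal{T}} v_i(\sigma)\, v_k(\rho) \left\{ N(p^2)\, \delta[\sigma - \rho + (j - l)\mathcal{T}] \right\} d\sigma\, d\rho \\
&= \begin{cases} (v_i, N v_k)_{\mathcal{T}}, & \text{if } j = l \\ 0, & \text{otherwise} \end{cases} \\
&= \begin{cases} \lambda_k\, (v_i, D v_k)_{\mathcal{T}}, & \text{if } j = l \\ 0, & \text{otherwise} \end{cases} \\
&= \begin{cases} \lambda_i, & \text{if } i = k \text{ and } j = l \\ 0, & \text{otherwise} \end{cases}
\end{aligned} \qquad (84)$$

where the last two lines follow from Eqs. (79) to (82). Thus, the $\{y_i(t)\}$ are "doubly orthogonal"
in the sense stated.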

A third and extremely important property of the $\{y_i(t)\}$ is that the corresponding eigenvalues
$\{\lambda_i\}$ are the ratio of the output to input signal energy, i.e.,

$$\lambda_i = \frac{(r_{i0}, r_{i0})_\infty}{(y_i, y_i)_{\mathcal{T}}} .$$

This is readily verified by noting from Eqs. (77), (81), and (84) that

$$\frac{(r_{i0}, r_{i0})_\infty}{(y_i, y_i)_{\mathcal{T}}} = \frac{(v_i, N v_i)_{\mathcal{T}}}{(v_i, D v_i)_{\mathcal{T}}} = \lambda_i . \qquad (85)$$

Thus, with $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots$, it follows that $y_1(t)$ is the solution to the original maximization
problem.

A final property of the $\{y_i(t)\}$ is obtained for the special case in which $z_i = -s_i$ for all i in
Eq. (67). For this case it follows directly that

$$y_i(t) = D^+(p)\, v_i(t)$$

where

$$H(s) \triangleq \frac{N^+(s)}{D^+(s)}$$

and thus that

$$r_{i0}(t) = N^+(p)\, v_i(t) .$$

Since $N^+(p)\, v_i(t)$ is a linear combination of derivatives of $v_i(t)$, this demonstrates that $r_{i0}(t)$ is
time limited to $[0, \mathcal{T}]$, i.e., the time-limited input $y_i(t)$ yields an output $r_{i0}(t)$ that is again time
limited to the same interval.† This special form of the solution to Eq. (63) has been studied by
Hancock and Schwarzlander (Ref. 90) and is of practical interest due to the possible simplification of
matched filter construction for time-limited signals.

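As an illustration of these results, consider the situation in which the channel filter is given by

$$H(s) = \frac{1}{1 + s}$$

Then

$$\frac{N(s^2)}{D(s^2)} \triangleq H(s)\, H(-s) = \frac{1}{1 - s^2}$$

and the boundary value differential equation becomes

$$(p^2 + \omega_i^2)\, v_i(t) = 0, \qquad 0 \le t \le \mathcal{T}$$

where

$$\omega_i^2 = \lambda_i^{-1} - 1$$

and

$$v_i(0) = v_i(\mathcal{T}) = 0$$

From this it follows readily that

$$v_i(t) = \begin{cases} a_i \sin \omega_i t & 0 \le t \le \mathcal{T} \\ 0 & \text{elsewhere} \end{cases}$$

where

$$\omega_i = i\pi/\mathcal{T}, \qquad i = 1, 2, 3, \ldots$$

From Eq. (67) and the discussion following Eq. (75), $Q(s)$ is

$$Q(s) = 1 \pm s$$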
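
As a brief check on this example, the steps below sketch how the harmonic equation and its eigenvalues arise from the boundary value equation $[N(p^2) - \lambda_i D(p^2)]\, v_i(t) = 0$. The only inputs are $N(s^2) = 1$ and $D(s^2) = 1 - s^2$, read off from $H(s)H(-s) = 1/(1 - s^2)$; the intermediate algebra is supplied here for illustration and is not quoted from the report.

```latex
\begin{align*}
[N(p^2) - \lambda_i D(p^2)]\, v_i(t) &= [1 - \lambda_i (1 - p^2)]\, v_i(t) = 0 \\
\Rightarrow\quad \bigl(p^2 + \lambda_i^{-1} - 1\bigr)\, v_i(t) &= 0,
  \qquad \omega_i^2 = \lambda_i^{-1} - 1 \\
v_i(0) = 0 \;&\Rightarrow\; v_i(t) = a_i \sin \omega_i t \\
v_i(\mathcal{T}) = 0 \;&\Rightarrow\; \omega_i \mathcal{T} = i\pi,
  \qquad \lambda_i = \frac{1}{1 + \omega_i^2} = \frac{1}{1 + (i\pi/\mathcal{T})^2}
\end{align*}
```

The eigenvalues therefore decrease monotonically with $i$, so the first eigenfunction has the largest energy transfer ratio.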


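† To the author's knowledge, this property of the solutions of Eq. (63) was first recognized by Richters.

Thus, the eigenfunctions and eigenvalues are

$$\lambda_i = [1 + \omega_i^2]^{-1}, \qquad i = 1, 2, \ldots$$

and

$$y_i(t) = \left[\frac{2\lambda_i}{\mathcal{T}}\right]^{1/2} [\sin \omega_i t \pm \omega_i \cos \omega_i t]$$

From this observe that

$$\int_0^{\mathcal{T}} y_i(t)\, y_j(t)\, dt = \left[\frac{4\lambda_i \lambda_j}{\mathcal{T}^2}\right]^{1/2} \int_0^{\mathcal{T}} [\sin \omega_i t \sin \omega_j t + \omega_i \omega_j \cos \omega_i t \cos \omega_j t \pm \omega_i \cos \omega_i t \sin \omega_j t \pm \omega_j \sin \omega_i t \cos \omega_j t]\, dt$$

However,

$$\int_0^{\mathcal{T}} \sin \omega_i t \sin \omega_j t\, dt = \int_0^{\mathcal{T}} \cos \omega_i t \cos \omega_j t\, dt = \frac{\mathcal{T}}{2}\, \delta_{ij}$$

and integration by parts shows that

$$\omega_i \int_0^{\mathcal{T}} \cos \omega_i t \sin \omega_j t\, dt = -\omega_j \int_0^{\mathcal{T}} \sin \omega_i t \cos \omega_j t\, dt$$

Thus,

$$\int_0^{\mathcal{T}} y_i(t)\, y_j(t)\, dt = \left[\frac{4\lambda_i \lambda_j}{\mathcal{T}^2}\right]^{1/2} \frac{\mathcal{T}}{2}\, [1 + \omega_i^2]\, \delta_{ij} = \delta_{ij}$$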


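and the $\{y_i(t)\}$ are orthonormal. Next, consider

$$r_{ij}(t) \triangleq \int_{-\infty}^{\infty} y_i(\sigma - j\mathcal{T})\, h(t - \sigma)\, d\sigma$$

Evaluation of the integral shows that

$$
r_{ij}(t) = \lambda_i \left[\frac{2\lambda_i}{\mathcal{T}}\right]^{1/2} \times
\begin{cases}
0 & t < j\mathcal{T} \\
(1 \pm \omega_i^2) \sin \omega_i (t - j\mathcal{T}) - \omega_i (1 \mp 1) \{\cos \omega_i (t - j\mathcal{T}) - \exp[-(t - j\mathcal{T})]\} & j\mathcal{T} \le t \le (j + 1)\mathcal{T} \\
\omega_i (1 \mp 1) \exp[-(t - j\mathcal{T})]\, [(-1)^{i-1} e^{\mathcal{T}} + 1] & t > (j + 1)\mathcal{T}
\end{cases}
$$

Observe from this that if the + sign is taken in the definition of $Q(s)$, $r_{ij}(t)$ has the simple form

$$
r_{ij}(t) =
\begin{cases}
\left[\dfrac{2\lambda_i}{\mathcal{T}}\right]^{1/2} \sin \omega_i (t - j\mathcal{T}) & j\mathcal{T} \le t \le (j + 1)\mathcal{T} \\
0 & \text{elsewhere}
\end{cases}
$$

In other words, the time-limited input signal $y_i(t - j\mathcal{T})$ yields an output that is again time limited to the same interval. Furthermore, it follows almost trivially that

$$(r_{ij}, r_{kl})_\infty = \begin{cases} \lambda_i & \text{if } i = k \text{ and } j = l \\ 0 & \text{otherwise} \end{cases}$$

i.e., for this $Q(s)$ the $\{y_i(t)\}$ are "doubly orthogonal" at the filter output.† Also, note that since the $\{y_i(t)\}$ are normalized, $\lambda_i$ is the ratio of the output to input signal energy.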
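
The properties claimed for this example can be checked numerically. The sketch below is an illustration only, not part of the original analysis: it assumes the channel $H(s) = 1/(1+s)$, i.e., $h(t) = e^{-t}$ for $t \ge 0$, and uses a hand-written Simpson rule; the function names are hypothetical. It tests orthonormality of the $y_i(t)$, the time-limited output for the + sign in $Q(s)$, and the output energy $\lambda_i$.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] using an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for m in range(1, n):
        total += (4 if m % 2 else 2) * f(a + m * h)
    return total * h / 3.0

def eigenpair(i, T):
    """omega_i = i*pi/T and lambda_i = 1/(1 + omega_i^2) for H(s) = 1/(1 + s)."""
    w = i * math.pi / T
    return w, 1.0 / (1.0 + w * w)

def y(i, t, T, sign=+1):
    """Eigenfunction y_i(t) on [0, T]; sign selects Q(s) = 1 + s or 1 - s."""
    w, lam = eigenpair(i, T)
    return math.sqrt(2.0 * lam / T) * (math.sin(w * t) + sign * w * math.cos(w * t))

def inner(i, j, T, sign=+1):
    """Inner product of y_i and y_j over [0, T]."""
    return simpson(lambda t: y(i, t, T, sign) * y(j, t, T, sign), 0.0, T)

def output(i, t, T, sign=+1):
    """Channel output r_i0(t): convolution of y_i with h(t) = exp(-t), t >= 0."""
    upper = min(max(t, 0.0), T)
    if upper == 0.0:
        return 0.0
    return simpson(lambda s: y(i, s, T, sign) * math.exp(-(t - s)), 0.0, upper)

def output_energy(i, T, sign=+1):
    """Energy of r_i0 over [0, T] (the whole output when sign = +1)."""
    return simpson(lambda t: output(i, t, T, sign) ** 2, 0.0, T)

T = 1.0
print("(y1,y1), (y1,y2):", inner(1, 1, T), inner(1, 2, T))
print("r(1.5T), + and - sign:", output(1, 1.5 * T, T, +1), output(1, 1.5 * T, T, -1))
print("output energy vs lambda_1:", output_energy(1, T), eigenpair(1, T)[1])
```

With the + sign the output sampled after the interval should vanish to quadrature accuracy, whereas with the − sign an exponential tail remains.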

The previous work has derived the optimum time-limited signal that gives zero intersymbol interference at the matched filter output. It is now of interest to compare the performance of this modulation technique to that obtained when the optimum technique of Sec. III-F is used.

Recall that optimum signals are constructed as linear combinations of the first N eigenfunctions, i.e., each signal is of the form

$$x_j(t) = \sum_{i=1}^{N} x_{ij}\, \varphi_i(t), \qquad 0 \le t \le T, \qquad j = 1, \ldots, M$$

and that a study of suboptimum modulation techniques involves finding "suitable substitutes" for the $\{\varphi_i(t)\}$. For the suboptimum modulation technique of this section, the $\{\varphi_i(t)\}$ are replaced by time translates of the optimum signal $y_1(t)$, i.e., if $\{\varphi_i'(t)\}$ are the functions substituted for the $\{\varphi_i(t)\}$ then‡

$$\varphi_i'(t) \triangleq y_1(t - i\mathcal{T}), \qquad 0 \le t \le T, \qquad i = 1, 2, \ldots, K \tag{86}$$

and a general input signal is of the form

$$x(t) = \sum_{i=1}^{K} x_i\, \varphi_i'(t), \qquad 0 \le t \le T \tag{87}$$

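For this input, the corresponding channel output is

$$y(t) = \sum_{i=1}^{K} x_i \sqrt{\lambda_1'}\; \theta_i'(t) + n(t) \tag{88}$$

where

$$\theta_i'(t) \triangleq \frac{1}{\sqrt{\lambda_1'}} \int_0^{T} \varphi_i'(\tau)\, h(t - \tau)\, d\tau$$

and $\lambda_1'$ is the first eigenvalue of Eq. (73). From this it follows that the matched filter output at $t = k\mathcal{T}$, $z(k\mathcal{T})$, is§

$$
\begin{aligned}
z(k\mathcal{T}) &= \int_{-\infty}^{\infty} y(t)\, \theta_0'(t - k\mathcal{T})\, dt = \int_{-\infty}^{\infty} y(t)\, \theta_k'(t)\, dt = (y, \theta_k')_\infty \\
&= \sum_{i=1}^{K} x_i \sqrt{\lambda_1'}\, (\theta_i', \theta_k')_\infty + (n, \theta_k')_\infty \\
&= \sqrt{\lambda_1'}\; x_k + n_k
\end{aligned}
\tag{89}
$$

where $n_k \triangleq (n, \theta_k')_\infty$ and the last step follows from Eq. (84) by noting that $\theta_i'(t) = (\lambda_1')^{-1/2}\, r_{1i}(t)$. Furthermore,

$$
\begin{aligned}
\overline{n_k n_j} &= \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \overline{n(\sigma)\, n(\rho)}\; \theta_k'(\sigma)\, \theta_j'(\rho)\, d\sigma\, d\rho \\
&= \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \delta(\sigma - \rho)\, \theta_k'(\sigma)\, \theta_j'(\rho)\, d\sigma\, d\rho \\
&= (\theta_k', \theta_j')_\infty = \delta_{kj}
\end{aligned}
\tag{90}
$$

† Verification of this property for $Q(s) = 1 - s$ is possible, but tedious and is omitted.

‡ It is assumed here that $\mathcal{T}$ and $T$ are related by $T = K\mathcal{T}$ where $K$ is an integer. However, $K = N$ is not assumed since $N$ is the optimum dimensionality of the $\{x_j(t)\}$ only when the $\{\varphi_i(t)\}$ are used.

§ For convenience it is assumed here that the matched filter is matched to $(\lambda_1')^{-1/2}$ times the channel filter output when $\varphi_i'(t)$ is transmitted. This, together with the assumed noise spectrum normalization (see Fig. 1), gives unity noise variance at the matched filter output.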


Thus, by comparing Eqs. (89) and (90) to the vector representations of Theorems 1 to 3, it follows that the value of C for this modulation technique,† say $C'$, can be obtained from Eqs. (32) and (35) by substituting

$$\lambda_i = \begin{cases} \lambda_1' & i = 1, \ldots, K \\ 0 & \text{otherwise} \end{cases}$$

and performing a maximization over $\mathcal{T}$. [The prime has been used here to indicate that $\lambda_1'$ is an eigenvalue of Eq. (73).] The resulting expression is

$$C' = \max_{\mathcal{T}} \frac{1}{2\mathcal{T}} \ln\, [1 + \lambda_1'(\mathcal{T})\, S\mathcal{T}] \tag{91}$$

where the fact that $\lambda_1'$ is a function of the length of the interval over which $y_1(t)$ is defined has been emphasized by writing $\lambda_1' = \lambda_1'(\mathcal{T})$. It should be mentioned that the maximization over $\mathcal{T}$ is required since the value of $\mathcal{T}$ in Eq. (86) is arbitrary and therefore should be chosen to maximize $C'$.

From Eq. (91) it is clear that a knowledge of the first eigenvalue of Eq. (73) is required as a function of the interval length $\mathcal{T}$. However, boundary value equations such as this are quite tedious to solve for specific cases. Thus, since only the eigenvalue is of interest and not the eigenfunction, it is desirable to investigate a means for determining the eigenvalue without solving the differential equation. Such a result can be obtained by considering the Laplace transform of the solution to Eqs. (68) and (73), repeated below for convenience:

$$[N(p^2) - \lambda_i D(p^2)]\, v_i(t) = 0, \qquad 0 \le t \le \mathcal{T} \tag{92}$$

$$
\begin{aligned}
v_i(0) &= v_i(\mathcal{T}) = 0 \\
&\;\;\vdots \\
v_i^{(n-1)}(0) &= v_i^{(n-1)}(\mathcal{T}) = 0
\end{aligned}
\tag{93}
$$

† Recall that a conclusion of Sec. III-E was that a knowledge of $R_c$ and $C$ as a function of signal power $S$ provides sufficient information about the error bounds for engineering purposes. However, in a study of suboptimum modulation systems which involves only a comparison (and not a numerical evaluation) of the error exponent for various techniques, it is convenient to make a further simplification and to evaluate only $C'$ as a function of $S$; the tacit assumption being that if $C'$ is close to $C$, then $R_c'$ is close to $R_c$ and conversely, if $C'$ differs greatly from $C$ then $R_c'$ differs greatly from $R_c$.


Since  no  additional  effort  is  required  to  obtain  an  expression  which  defines  all  the  eigenvalues, 
this  will  be  done. 

Recall that by assumption the $\{v_i(t)\}$ are zero outside the interval $[0, \mathcal{T}]$ and that the polynomial $D(s^2)$ is of order $2n$. Assume that the order of $N(s^2)$ is $2m$ and that $m < n$. Finally, observe that although the boundary conditions imply that all derivatives of $v_i(t)$ through the $(n - 1)$st are zero at $t = 0$ and $t = \mathcal{T}$, it is possible that higher order derivatives of $v_i(t)$ will involve impulses and their derivatives at these end points. In view of these conditions, Eqs. (92) and (93) can be replaced by the equation

$$[N(p^2) - \lambda_i D(p^2)]\, v_i(t) = P_{1i}(p)\, \delta(t) + P_{2i}(p)\, \delta(t - \mathcal{T}), \qquad -\infty < t < \infty \tag{94}$$

where $P_{1i}(s)$ and $P_{2i}(s)$ are polynomials of order $(n - 1)$ or less whose coefficients will be determined later. Since Eq. (94) is satisfied for all $t$, the Laplace transform of both sides can be taken. This yields

$$V_i(s) = \frac{P_{1i}(s) + P_{2i}(s)\, e^{s\mathcal{T}}}{N(s^2) - \lambda_i D(s^2)} \tag{95}$$

where $V_i(s)$ is the transform of $v_i(t)$ and the domain of convergence is the entire finite s-plane. It is shown in Appendix E that all nondegenerate solutions of Eqs. (92) and (93) are either even or odd functions about $t = \mathcal{T}/2$. Thus, since an even or odd function has an even or odd transform, it follows that for nondegenerate eigenfunctions

$$\frac{P_{1i}(s)\, e^{-s\mathcal{T}/2} + P_{2i}(s)\, e^{s\mathcal{T}/2}}{N(s^2) - \lambda_i D(s^2)} = \pm\, \frac{P_{1i}(-s)\, e^{s\mathcal{T}/2} + P_{2i}(-s)\, e^{-s\mathcal{T}/2}}{N(s^2) - \lambda_i D(s^2)}$$

or

$$P_{1i}(s) \mp P_{2i}(-s) = \pm\, [P_{1i}(-s) \mp P_{2i}(s)]\, e^{s\mathcal{T}} \tag{96}$$

Expansion of $e^{s\mathcal{T}}$ shows that the right-hand side of Eq. (96) is an infinite order polynomial while the left-hand side is at most an $(n - 1)$st order polynomial. Thus,

$$P_{1i}(s) \mp P_{2i}(-s) = 0$$

and Eq. (95) becomes†

$$V_i(s) = \frac{P_{1i}(s) \pm P_{1i}(-s)\, e^{s\mathcal{T}}}{N(s^2) - \lambda_i D(s^2)} \tag{97}$$

† Although the previous discussion has shown Eq. (97) to be true only for nondegenerate eigenfunctions, it is possible by means of an argument identical to that used by Youla$^{92}$ to show that it is true for all eigenfunctions.

Next, recall that $V_i(s)$ must be an entire function since $v_i(t)$ is a pulse. Thus, the coefficients in $P_{1i}(s)$ must be chosen so that the numerator of Eq. (97) contains all the zeros of the denominator. If $\pm s_{i1}, \pm s_{i2}, \ldots, \pm s_{in}$ are the $2n$ roots of $N(s^2) - \lambda_i D(s^2) = 0$, this condition leads to the $n$ homogeneous equations

$$P_{1i}(s_{il}) \pm P_{1i}(-s_{il})\, e^{s_{il}\mathcal{T}} = 0, \qquad l = 1, \ldots, n \tag{98}$$

Since

$$P_{1i}(s) = \sum_{j=1}^{n} a_{ij}\, s^{j-1}$$

Eq. (98) becomes

$$a_{i1}(1 \pm w_{il}) + a_{i2}(1 \mp w_{il})\, s_{il} + a_{i3}(1 \pm w_{il})\, s_{il}^{2} + \cdots + a_{in}[1 \pm (-1)^{n-1} w_{il}]\, s_{il}^{n-1} = 0, \qquad l = 1, \ldots, n \tag{99}$$

where $w_{il} \triangleq \exp[s_{il}\mathcal{T}]$. Thus, a set of nonzero coefficients $\{a_{ij}\}$ exists if and only if

$$
\Delta =
\begin{vmatrix}
(1 \pm w_{i1}) & (1 \mp w_{i1})\, s_{i1} & \cdots & [1 \pm (-1)^{n-1} w_{i1}]\, (s_{i1})^{n-1} \\
(1 \pm w_{i2}) & (1 \mp w_{i2})\, s_{i2} & \cdots & [1 \pm (-1)^{n-1} w_{i2}]\, (s_{i2})^{n-1} \\
\vdots & \vdots & & \vdots \\
(1 \pm w_{in}) & (1 \mp w_{in})\, s_{in} & \cdots & [1 \pm (-1)^{n-1} w_{in}]\, (s_{in})^{n-1}
\end{vmatrix}
= 0 \tag{100}
$$

Equation (100) is the desired result which allows evaluation of the $\{\lambda_i\}$ without explicit solution of the boundary value differential equation. [In addition, observe that this result and Eqs. (97) and (99) provide a frequency domain solution for the $\{v_i(t)\}$.]

The result will now be used to evaluate the performance of the suboptimum modulation technique of this section. A convenient class of channels to consider are those having Butterworth filter characteristics, i.e., channels for which

$$H(s)\, H(-s) = \frac{1}{1 + (-1)^n \left(\dfrac{s}{2\pi}\right)^{2n}}$$

For this class of filters, the roots of $N(s^2) - \lambda_i D(s^2) = 0$ can be expressed in the simple form

$$s_{il} = (2\pi)\, [\lambda_i^{-1} - 1]^{1/2n} \exp\left[ j\pi \left( \frac{l - 1}{n} + \frac{1}{2} \right) \right], \qquad l = 1, \ldots, n$$

Observe that if $\lambda_i$ is written as

$$\lambda_i = \frac{1}{1 + \left(\dfrac{k_{ni}}{\mathcal{T}}\right)^{2n}} \tag{101}$$

where $k_{ni}$ is a constant, then

$$s_{il} = \frac{2\pi k_{ni}}{\mathcal{T}} \exp\left[ j\pi \left( \frac{l - 1}{n} + \frac{1}{2} \right) \right], \qquad l = 1, \ldots, n$$

and after factoring out common terms, Eq. (100) becomes

$$
\Delta =
\begin{vmatrix}
(1 \pm w_{i1}) & (1 \mp w_{i1}) & \cdots & [1 \pm (-1)^{n-1} w_{i1}] \\
\vdots & \vdots & & \vdots \\
(1 \pm w_{in}) & (1 \mp w_{in}) \{\exp[j\pi(n - 1)/n]\} & \cdots & [1 \pm (-1)^{n-1} w_{in}] \{\exp[j\pi(n - 1)/n]\}^{n-1}
\end{vmatrix}
= 0 \tag{102}
$$

where now $w_{il} = \exp\{2\pi k_{ni} \exp[j\pi((l - 1)/n + 1/2)]\}$. Thus, for the Butterworth filters, $\Delta$ is independent of $\mathcal{T}$ when the $\lambda_i$ are expressed as in Eq. (101).

Consider first the case for $n = 1$. From Eq. (102) it follows that $k_{1i}$ must satisfy the relation

$$\{1 \pm \exp[j2\pi k_{1i}]\} = 0 \tag{103}$$

However, as noted previously, $\lambda_i$ is the energy transfer ratio of the filter when $y_i(t)$ is transmitted. Since the filter is normalized so that

$$\max_f |H(f)| = 1$$

it follows that $\lambda_i < 1$. Thus, $k_{1i} = 0$ is not an allowed solution to Eq. (103) and the final result is

$$k_{1i} = i/2, \qquad i = 1, 2, \ldots$$

or

$$\lambda_i = \frac{1}{1 + \left(\dfrac{i}{2\mathcal{T}}\right)^{2}} \tag{104}$$

which agrees with the result found on page 67 after the introduction of a bandwidth scale factor of $2\pi$. Substitution of this expression into Eq. (91) shows that

$$C' = \max_{\mathcal{T}} \frac{1}{2\mathcal{T}} \log_2 \left\{ 1 + \frac{S\mathcal{T}}{1 + \left(\dfrac{1}{2\mathcal{T}}\right)^{2}} \right\} \quad \text{bits/second}$$


This  expression  is  plotted  in  Fig.  10  along  with  the  value  of  C  =  R(0).  It  should  be  noted  that 
this  technique  is  inferior  to  the  optimum  technique  by  approximately  3  db  for  S/2  =  10  and  by 
nearly  7  db  for  S/2  =  10^. 
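
The maximization over $\mathcal{T}$ for the $n = 1$ channel is easy to carry out numerically. The sketch below is illustrative only: it takes $\lambda_1(\mathcal{T}) = [1 + (1/2\mathcal{T})^2]^{-1}$ from Eq. (104), assumes that the rate per second carries a prefactor $1/(2\mathcal{T})$ (the prefactor is not legible in the scanned expression, so this is an assumption), and uses a plain logarithmic grid search with hypothetical function names. It makes no attempt to reproduce the curves of Fig. 10.

```python
import math

def lambda_1(interval):
    """First eigenvalue for the n = 1 Butterworth channel, Eq. (104) with i = 1."""
    return 1.0 / (1.0 + (1.0 / (2.0 * interval)) ** 2)

def rate_bits(S, interval):
    """Rate in bits/second for one signal per interval (1/(2T) prefactor ASSUMED)."""
    return math.log2(1.0 + lambda_1(interval) * S * interval) / (2.0 * interval)

def c_prime(S, points=4000, lo=1e-2, hi=1e2):
    """Grid search over a log-spaced set of interval lengths; returns (rate, interval)."""
    best_rate, best_interval = 0.0, lo
    for m in range(points + 1):
        interval = lo * (hi / lo) ** (m / points)
        r = rate_bits(S, interval)
        if r > best_rate:
            best_rate, best_interval = r, interval
    return best_rate, best_interval

for S in (2.0, 20.0, 200.0):
    c, interval = c_prime(S)
    print(f"S = {S:6.1f}  C' ~ {c:.3f} bits/s at interval ~ {interval:.3f}")
```

Short intervals are penalized by the small eigenvalue and long intervals by the logarithm, which is why an interior maximum exists.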

To investigate the performance for higher order channels, it is necessary to solve for $k_{ni}$ from Eq. (102). Since this becomes quite tedious manually for $n > 2$, the evaluation has been accomplished numerically on a digital computer. The result is that to two significant figures and (at least) for $n \le 10$, $k_{ni}$ is given by

$$k_{ni} = \tfrac{1}{2} \left[ \tfrac{1}{2}(n - 1) + i \right], \qquad i = 1, 2, \ldots$$

Thus,

$$\lambda_i = \frac{1}{1 + \left[ \dfrac{\tfrac{1}{2}(n - 1) + i}{2\mathcal{T}} \right]^{2n}} \tag{105}$$

and Eq. (91) becomes

$$C' = \max_{\mathcal{T}} \frac{1}{2\mathcal{T}} \log_2 \left\{ 1 + S\mathcal{T} \left[ 1 + \left( \frac{n + 1}{4\mathcal{T}} \right)^{2n} \right]^{-1} \right\} \quad \text{bits/second}$$

Fig. 10. $C$, $C'$, and $C_{\mathcal{T}}$ for channel with $N(f) = 1$ and $|H(f)|^2 = [1 + f^2]^{-1}$.

Fig. 11. $C$, $C'$, and $C_{\mathcal{T}}$ for channel with $N(f) = 1$ and $|H(f)|^2 = [1 + f^{20}]^{-1}$.

Figure 11 presents $C'$ for a Butterworth channel with $n = 10$ together with the value of $C = R(0)$ calculated from Eqs. (37) to (39). It should be noted that although $C' \to C$ for $S \to 0$, there is an effective signal power loss of 25 db for S/2 = 10 .
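
The determinant condition of Eq. (102) can be checked directly for a small case. The sketch below treats $n = 2$ only, builds the $2 \times 2$ determinant from $w_{il} = \exp\{2\pi k \exp[j\pi((l-1)/n + 1/2)]\}$, and locates its first root for each sign choice by bisection. Removing the common phase factor $\exp(j\pi k)$ to obtain a real-valued function is a step worked out here for illustration and is not taken from the report; the function names are hypothetical. The roots can be compared with the two-figure rule $k_{ni} = \tfrac{1}{2}[\tfrac{1}{2}(n-1) + i]$, which gives 0.75 and 1.25 for $n = 2$.

```python
import cmath
import math

def delta(k, sign):
    """2x2 determinant of Eq. (102) for n = 2; sign = +1 (upper) or -1 (lower)."""
    n = 2
    rows = []
    for l in (1, 2):
        angle = math.pi * ((l - 1) / n + 0.5)
        w = cmath.exp(2.0 * math.pi * k * cmath.exp(1j * angle))
        rot = cmath.exp(1j * math.pi * (l - 1) / n)
        # Column index col = 0..n-1 corresponds to the power s^(col) in Eq. (99).
        rows.append([(1 + sign * (-1) ** col * w) * rot ** col for col in range(n)])
    return rows[0][0] * rows[1][1] - rows[0][1] * rows[1][0]

def real_form(k, sign):
    """Strip the common phase exp(j*pi*k); the remainder is purely imaginary
    for the upper sign and purely real for the lower sign (illustrative algebra)."""
    d = delta(k, sign) * cmath.exp(-1j * math.pi * k)
    return d.imag if sign > 0 else d.real

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f changes sign exactly once on [lo, hi]."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (flo > 0) == (fmid > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

k_21 = bisect(lambda k: real_form(k, +1), 0.6, 0.9)  # bracket chosen by inspection
k_22 = bisect(lambda k: real_form(k, -1), 1.1, 1.4)
print("k_21 ~", round(k_21, 4), " rule gives", 0.5 * (0.5 + 1))
print("k_22 ~", round(k_22, 4), " rule gives", 0.5 * (0.5 + 2))
```

For $n = 2$ the reduced conditions are the familiar $\tan x = \mp \tanh x$ with $x = \pi k$, so the roots agree with the rule only to the stated two significant figures.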

This extremely poor performance at only moderately large signal powers, when coupled with the lesser but still significant loss for the simple $n = 1$ channel, suggests that this suboptimum modulation technique is of limited practical interest. The following paragraph considers an extension of the present technique that leads to significantly improved performance at the expense of an increase in equipment complexity.

Recall from Eqs. (80) and (84) that the $\{y_i(t)\}$ are orthonormal and are "doubly orthogonal" at the channel filter output. Because of these properties it is possible to substitute for the $\{\varphi_i(t)\}$ not just time translates of the single function $y_1(t)$ but instead time translates of the first few of the $\{y_i(t)\}$. When this is done, $\mathcal{T}$ can be increased and improved performance obtained. More specifically, consider the following situation. Let $T$ be fixed, assume that $T = k\mathcal{T}$ with $k$ an integer, let the $\{\varphi_i(t)\}$ be replaced with time translates of the first $N'$ of the $\{y_i(t)\}$, and let $N'$ be chosen to maximize the error exponent. It then follows from the discussion of Eq. (91) and Eqs. (32) and (35) that the value of $C$ for this modulation technique, say $C_{\mathcal{T}}$, is

$$C_{\mathcal{T}} = \frac{1}{2\mathcal{T}} \sum_{i=1}^{N'} \log_2 \frac{\lambda_i}{B_{\mathcal{T}}(0)} \quad \text{bits/second} \tag{106}$$

where

$$\frac{1}{B_{\mathcal{T}}(0)} \triangleq \frac{S\mathcal{T} + \displaystyle\sum_{i=1}^{N'} \lambda_i^{-1}}{N'}$$

and

$$\lambda_{N'} \ge B_{\mathcal{T}}(0) \ge \lambda_{N'+1}$$

and the $\{\lambda_i\}$ are eigenvalues of Eqs. (92) and (93). Equation (106) is plotted in Figs. 10 and 11 for the $n = 1$ and $n = 10$ Butterworth channel, respectively, using the eigenvalues of Eq. (105). Observe from Fig. 10 that for the $n = 1$ channel and with $\mathcal{T} = 1$, use of the first twelve of the $\{y_i(t)\}$ gives a modulation system whose performance is within 1 db of ideal at S/2 = 10 . Figure 11 demonstrates a similar improvement for the $n = 10$ channel except that for this channel both $\mathcal{T}$ and $N'$ must be significantly greater than for the $n = 1$ channel to achieve the same level of performance.

In conclusion, two points should be made concerning this modulation technique.

(1) As suggested in Fig. 11, it is possible by using increasingly large values of $\mathcal{T}$ and $N'$ to obtain a modulation system for which $C_{\mathcal{T}}$ is arbitrarily close to $C$. As a practical matter, it appears that a value of $\mathcal{T}$ between one and twenty times the reciprocal of the filter 3-db bandwidth is sufficient to achieve most of the improvement possible.

(2) Since the determination of the $\{y_i(t)\}$ becomes increasingly difficult for higher order channels, and since a greater number of the $\{y_i(t)\}$ are required for efficient operation with these channels, it is clear that this modulation technique can be considered practical only for low-order channels.
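
The selection of $N'$ and $B_{\mathcal{T}}(0)$ in Eq. (106) amounts to a water-filling calculation, which the following sketch illustrates. It is not the original computer program: the eigenvalues are taken from the two-figure approximation of Eq. (105), the $1/(2\mathcal{T})$ prefactor is assumed as read from the scanned Eq. (106), and the function names are hypothetical. The routine tries $N' = 1, 2, \ldots$ until the condition $\lambda_{N'} \ge B_{\mathcal{T}}(0) \ge \lambda_{N'+1}$ is met.

```python
import math

def eigenvalues(n, interval, count):
    """lambda_i per Eq. (105) for an n-th order Butterworth channel, i = 1..count."""
    return [1.0 / (1.0 + ((0.5 * (n - 1) + i) / (2.0 * interval)) ** (2 * n))
            for i in range(1, count + 1)]

def c_multi(S, interval, lams):
    """Return (rate in bits/s, N', B) following Eq. (106).

    lams must be sorted in decreasing order and contain more than N' entries.
    """
    energy = S * interval
    inv_sum = 0.0
    for n_used in range(1, len(lams)):
        inv_sum += 1.0 / lams[n_used - 1]
        B = n_used / (energy + inv_sum)
        # Accept N' when lambda_{N'} >= B >= lambda_{N'+1}.
        if lams[n_used - 1] >= B >= lams[n_used]:
            rate = sum(math.log2(lam / B) for lam in lams[:n_used]) / (2.0 * interval)
            return rate, n_used, B
    raise ValueError("supply more eigenvalues")

lams = eigenvalues(1, 1.0, 40)
rate, n_used, B = c_multi(200.0, 1.0, lams)
print("N' =", n_used, " B =", round(B, 5), " rate ~", round(rate, 3), "bits/s")
```

The energy assigned to the $i$th eigenfunction is $B^{-1} - \lambda_i^{-1}$, so the allocations sum to $S\mathcal{T}$ by construction of $B$.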


B.  RECEIVER  FILTER  DESIGN  TO  ELIMINATE  INTERSYMBOL  INTERFERENCE 

The previous section has considered a suboptimum modulation technique that eliminates intersymbol interference by means of a suitable choice of the transmitted signal. This section considers an alternate approach in which the receiver matched filter is replaced by a filter that has been designed to maximize SNR and eliminate intersymbol interference.†

Consider again the channel of Fig. 1, let $N(f)$ be arbitrary, and assume that a signal $x(t)$ is given. For this situation it is desired to design a receiver filter $h_1(t)$ so that when $x(t)$ is transmitted the filter output will be zero at $t = k\mathcal{T}$, $k = \pm 1, \ldots, \pm N$, and nonzero at $t = 0$.‡ In this manner, it will be possible to transmit time translates (by $k\mathcal{T}$ seconds) of $x(t)$ without incurring intersymbol interference. Since there are, in general, an infinite number of filters having this property, it is desirable to choose the filter that maximizes the SNR at $t = 0$. This problem is readily solved using standard techniques of the calculus of variations. When $x(t)$ is transmitted, the signal portion of the output of $h_1(t)$, say $z(t)$, is

$$z(t) = \int_{-\infty}^{\infty} X(f)\, H(f)\, H_1(f)\, e^{j\omega t}\, df \tag{107}$$

where $X(f)$, $H(f)$, and $H_1(f)$ are the Fourier transforms of $x(t)$, $h(t)$, and $h_1(t)$, respectively. Also, the noise output, say $n_o(t)$, is

$$n_o(t) = \int_{-\infty}^{\infty} n(\sigma)\, h_1(t - \sigma)\, d\sigma$$


For this problem, a useful SNR definition is

$$
\text{SNR} \triangleq \frac{z^2(0)}{\overline{n_o^2(t)}}
= \frac{\left[ \displaystyle\int_{-\infty}^{\infty} X(f)\, H(f)\, H_1(f)\, df \right]^2}{\displaystyle\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \overline{n(\sigma)\, n(\rho)}\; h_1(t - \sigma)\, h_1(t - \rho)\, d\sigma\, d\rho}
= \frac{\left[ \displaystyle\int_{-\infty}^{\infty} X(f)\, H(f)\, H_1(f)\, df \right]^2}{\displaystyle\int_{-\infty}^{\infty} N(f)\, |H_1(f)|^2\, df}
\tag{108}
$$

† To the author's knowledge, this problem was first considered by Tufts.$^{93}$ The work presented here represents an alternate and somewhat simplified derivation of his result and in addition provides some insight into the SNR degradation caused by the elimination of intersymbol interference.

‡ The number $N$ is arbitrary at this point. Appropriate values will be indicated later when specific examples are considered.

Thus, the problem is that of choosing $H_1(f)$ to maximize Eq. (108) under the constraints $z(k\mathcal{T}) = 0$, $k = \pm 1, \ldots, \pm N$. Use of Lagrange multipliers shows that $H_1(f)$ must be chosen to minimize the functional†

$$I = \int_{-\infty}^{\infty} \left\{ N(f)\, |H_1(f)|^2 - X(f)\, H(f)\, H_1(f) \sum_{k=-N}^{N} 2\beta_k\, e^{j\omega k\mathcal{T}} \right\} df$$


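This minimization is accomplished in the usual manner by substituting $H_1(f) = H_o(f) + \epsilon H_2(f)$, where $H_o(f)$ is the desired optimum filter, and setting

$$\left. \frac{dI}{d\epsilon} \right|_{\epsilon = 0} = 0$$

Upon performing the substitution it is found that

$$\left. \frac{dI}{d\epsilon} \right|_{\epsilon = 0} = \int_{-\infty}^{\infty} \left\{ N(f)\, [H_o(f)\, H_2^*(f) + H_o^*(f)\, H_2(f)] - X(f)\, H(f)\, H_2(f) \sum_{k} 2\beta_k\, e^{j\omega k\mathcal{T}} \right\} df \tag{109}$$

However, since real-time functions are assumed and $N(f)$ is a power spectral density,

$$\int_{-\infty}^{\infty} N(f)\, H_o(f)\, H_2^*(f)\, df = \int_{-\infty}^{\infty} N(-f)\, H_o(-f)\, H_2^*(-f)\, df = \int_{-\infty}^{\infty} N(f)\, H_o^*(f)\, H_2(f)\, df$$

Thus, Eq. (109) can be written

$$\left. \frac{dI}{d\epsilon} \right|_{\epsilon = 0} = 2 \int_{-\infty}^{\infty} H_2(f) \left[ N(f)\, H_o^*(f) - X(f)\, H(f) \sum_{k} \beta_k\, e^{j\omega k\mathcal{T}} \right] df \tag{110}$$

Upon requiring that

$$\left. \frac{dI}{d\epsilon} \right|_{\epsilon = 0} = 0$$

for all $H_2(f)$ it follows that the bracketed quantity in the integrand of Eq. (110) must be zero for all $f$, i.e., $H_o(f)$ must be given by

$$H_o(f) = \frac{X^*(f)\, H^*(f)}{N(f)} \sum_{k=-N}^{N} \beta_k\, e^{-j\omega k\mathcal{T}} \tag{111}$$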


It is interesting to note that $H_o(f)$ can be realized as the cascade of the optimum detector for colored noise followed by a "zero forcing" tapped delay line.

Next, it is necessary to solve for the $\{\beta_k\}$ that satisfy the intersymbol interference constraints. From Eq. (107) the output of $h_o(t)$ at $t = i\mathcal{T}$ is

$$z(i\mathcal{T}) = \sum_{k=-N}^{N} \beta_k\, W_{ik} \tag{112}$$

where

$$W_{ik} \triangleq \int_{-\infty}^{\infty} \frac{|X(f)\, H(f)|^2}{N(f)} \exp[j\omega(i - k)\mathcal{T}]\, df$$

† Observe that choosing $H_1(f)$ to minimize $\overline{n_o^2(t)}$ with $z(0)$ fixed is equivalent to choosing $H_1(f)$ to maximize Eq. (108).


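Thus, if the column vectors $\mathbf{z}$ and $\boldsymbol{\beta}$ are defined as

$$
\mathbf{z} \triangleq
\begin{bmatrix} z(-N\mathcal{T}) \\ \vdots \\ z(0) \\ \vdots \\ z(N\mathcal{T}) \end{bmatrix},
\qquad
\boldsymbol{\beta} \triangleq
\begin{bmatrix} \beta_{-N} \\ \vdots \\ \beta_0 \\ \vdots \\ \beta_N \end{bmatrix}
$$

and if a matrix $[W]$ is defined as

$$
[W] \triangleq
\begin{bmatrix}
W_{-N,-N} & \cdots & W_{-N,0} & \cdots & W_{-N,N} \\
\vdots & & \vdots & & \vdots \\
W_{0,-N} & \cdots & W_{0,0} & \cdots & W_{0,N} \\
\vdots & & \vdots & & \vdots \\
W_{N,-N} & \cdots & W_{N,0} & \cdots & W_{N,N}
\end{bmatrix}
$$

it follows that

$$\mathbf{z} = [W]\, \boldsymbol{\beta} \tag{113}$$

Since $[W]$ is known$^{93}$ to be nonsingular, the inverse matrix $[W]^{-1}$ exists and Eq. (113) becomes

$$[W]^{-1}\, \mathbf{z} = [W]^{-1} [W]\, \boldsymbol{\beta}$$

or

$$\boldsymbol{\beta} = [W]^{-1}\, \mathbf{z} \tag{114}$$

This is the desired expression that gives the tap gain settings $\boldsymbol{\beta}$ in terms of the constraint equations†

$$z(k\mathcal{T}) = \begin{cases} 1 & k = 0 \\ 0 & k \ne 0 \end{cases} \tag{115}$$

† Recall that in the maximization procedure $z(0)$ was fixed but arbitrary. Since the value of $z(0)$ has no effect on the SNR, the normalization $z(0) = 1$ has been assumed.

Finally, it is necessary to evaluate the SNR for this filter. From Eqs. (108) and (115) this is

$$\text{SNR} = \left\{ \sum_{i,k} \beta_i\, \beta_k \int_{-\infty}^{\infty} \frac{|X(f)\, H(f)|^2}{N(f)} \exp[j\omega(i - k)\mathcal{T}]\, df \right\}^{-1}$$

or, from Eqs. (113) and (115),

$$\text{SNR} = \beta_0^{-1} \tag{116}$$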
For comparison purposes, it should be recalled that the SNR at the output of the optimum detector for colored noise and an infinite observation interval is$^{54}$

$$\text{SNR}\big|_o = \int_{-\infty}^{\infty} \frac{|X(f)\, H(f)|^2}{N(f)}\, df$$

Thus, if $X(f)$ is assumed normalized so that $\text{SNR}|_o = 1$, $\beta_0^{-1}$ will be the ratio of the SNR for the zero forcing filter to that for the optimum detector. Since this interpretation of $\beta_0^{-1}$ is quite useful, the following examples assume $X(f)$ to be so normalized.

Ideally, the performance of this modulation technique would now be evaluated by comparing the value of $C$ for this approach with that obtained for the optimum technique. Unfortunately, when $X(f)$, $H(f)$, and $N(f)$ are chosen so that the determination of $\boldsymbol{\beta}$ from Eq. (114) is feasible, it is impractical to determine $C$ for the optimum technique. Conversely, when $H(f)$ and $N(f)$ are chosen to simplify evaluation of $C$ for the optimum technique, it is impractical to evaluate $\boldsymbol{\beta}$ from Eq. (114). Thus, it is necessary to consider an alternate and simpler evaluation of the zero forcing technique. A relatively simple approach is to choose $X(f)$, $H(f)$, and $N(f)$ so that $\beta_0$ can be readily determined from Eq. (114). Then since $\beta_0^{-1}$ represents the SNR degradation caused by the elimination of intersymbol interference, it seems reasonable to conclude that if $\beta_0^{-1}$ for a given situation is close to unity the performance of this technique would be acceptable. Conversely, if a situation is found in which $\beta_0^{-1}$ is quite small, this technique would be unacceptable.

Recall from Eq. (111) that the zero forcing filter can be realized as the cascade of the optimum detector and a zero forcing tapped delay line. Viewed in this manner, it appears that the zero forcing procedure should lead to an excessive SNR loss only if the intersymbol interference at the detector output is in some sense large. The following examples confirm this speculation.

Let $W(t)$ be the output of the optimum detector when $x(t)$ is transmitted, i.e.,

$$W(t) = \int_{-\infty}^{\infty} \frac{|X(f)\, H(f)|^2}{N(f)}\, e^{j\omega t}\, df$$

and consider first the situation in which $X(f)$, $H(f)$, and $N(f)$ are chosen so that $W(t)$ appears as in Fig. 12. Then the elements in the $[W]$ matrix are

$$W_{ik} = \begin{cases} 1 & i = k \\ 1/2 & i = k \pm 1 \\ 0 & \text{otherwise} \end{cases}$$

Fig. 12. Example in which intersymbol interference is large.

Calculations for $N = 1, 2, 3$ suggest that for this $[W]$ the $\{\beta_k\}$ are given by

$$\beta_k = (-1)^{k}\, [N + 1 - |k|] \tag{117}$$

Substitution of Eq. (117) into Eq. (113) shows that

$$
\begin{aligned}
z(i\mathcal{T}) = \sum_{k=-N}^{N} \beta_k\, W_{ik} &= (-1)^{|i|+1} \left\{ |i| - \tfrac{1}{2} |i - 1| - \tfrac{1}{2} |i + 1| \right\}, \qquad |i| < N \\
&= \begin{cases} 1 & \text{if } i = 0 \\ 0 & \text{otherwise} \end{cases} \qquad |i| \le N
\end{aligned}
$$

which proves that Eq. (117) is valid for arbitrary $N$. Observe from Eq. (117) that $|\beta_{\pm N}| = 1$. Thus, the intersymbol interference between $x(t)$ and $x[t \pm (N + 1)\mathcal{T}]$ is independent of $N$. As a result, it would be necessary to set $N = \infty$ before zero intersymbol interference would be obtained for all sampling instants. However, Eq. (117) shows that $\beta_0 = N + 1$. Therefore, as $N \to \infty$, $\beta_0^{-1} \to 0$, i.e., the SNR loss caused by the elimination of intersymbol interference becomes arbitrarily large as the intersymbol interference is eliminated for an arbitrarily large number of sampling instants. Thus, for this example, zero forcing is completely unacceptable as a technique for eliminating intersymbol interference.

Another example that gives some additional insight into the zero forcing filter is the following. Let $X(f)$, $H(f)$, and $N(f)$ be chosen so that $W(t)$ appears as in Fig. 13. Then the elements of $[W]$ are

$$W_{ik} = \begin{cases} 1 & i = k \\ 0 & i = k \pm 1 \\ -1/4 & i = k \pm 2 \\ 0 & \text{otherwise} \end{cases}$$

Fig. 13. Example in which intersymbol interference is small.

For this $[W]$ it is readily verified that the $\{\beta_k\}$ given by the following relations satisfy Eqs. (113) and (115).

$$
\begin{aligned}
\beta_k &= 0, \qquad k \text{ odd} \\
\beta_N &= 1/D \\
\beta_{N-2} &= 4/D \\
\beta_{N-4} &= 15/D \\
\beta_k &= 4\beta_{k+2} - \beta_{k+4} \qquad \text{for } k \text{ even}, \quad 0 \le k \le N - 4
\end{aligned}
\tag{118}
$$

$D$ is determined from the relation $\beta_0 - \tfrac{1}{2}\beta_2 = 1$. Thus, if $N = 6$, for example, the $\{\beta_k\}$ are

$$\beta_{\pm 1} = \beta_{\pm 3} = \beta_{\pm 5} = 0$$

$$\beta_0 = 112/97, \qquad \beta_{\pm 2} = 30/97, \qquad \beta_{\pm 4} = 8/97, \qquad \beta_{\pm 6} = 2/97.$$

Two important results can be obtained from Eq. (118). First, observe that $\beta_{\pm N} = 1/D$. Although no general expression for $D$ has been found, calculations show that $D$ grows rapidly with $N$.† Thus, for only moderately large $N$, the intersymbol interference between $x(t)$ and $x[t \pm (N + 1)\mathcal{T}]$ will be negligible. This implies that from a practical standpoint, zero intersymbol interference can be obtained at all sampling instants by means of a finite length tapped delay line. This result, which is of considerable practical importance, should be contrasted with the analogous result in the previous example. Second, observe that $\beta_0$ can be written as

$$\beta_0 = \frac{n_0}{n_0 - \tfrac{1}{2} n_2}$$

where $n_0$ and $n_2$ are the numerators of $\beta_0$ and $\beta_2$, respectively, i.e., $\beta_0 = n_0/D$ and $\beta_2 = n_2/D$. Use of this result together with the iterative relations of Eq. (118) shows that, to slide rule accuracy,

$$\beta_0 \approx 1.15 \qquad \text{for all } N$$

† For example, the following results can be determined from Eq. (118).

N:   2     4     6      8     10

D:   3.5   13    48.5   181   675.5


Thus, the SNR loss due to the elimination of intersymbol interference is negligible and it may be concluded that for this example zero forcing is a satisfactory technique for eliminating intersymbol interference.

In summary, it appears from the previous examples that zero forcing is a suitable technique for eliminating intersymbol interference when the intersymbol interference at the optimum detector output is "suitably small," i.e., when the output has a ringing form as suggested by Fig. 13. Conversely, when the intersymbol interference is "large" as suggested by Fig. 12, the zero forcing procedure causes an excessive loss in SNR.
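
The tap gains of Eq. (114) for the two examples can be reproduced with exact rational arithmetic. The sketch below is an illustration under stated assumptions: the sample values $W(0) = 1$, $W(\pm\mathcal{T}) = 1/2$ for the Fig. 12 case are inferred from Eq. (117) and the constraint equations rather than read from the figure, the Fig. 13 samples are those listed in the text, and the helper names are hypothetical.

```python
from fractions import Fraction

def solve(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(b)
    M = [[Fraction(x) for x in row] + [Fraction(v)] for row, v in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def tap_gains(samples, N):
    """Solve z = [W] beta with z(0) = 1 and z(kT) = 0 for k != 0, |k| <= N.

    samples maps a lag m >= 0 to W(mT); lags not listed are taken as zero.
    Returns {k: beta_k} for k = -N..N.
    """
    idx = list(range(-N, N + 1))
    W = [[samples.get(abs(i - k), 0) for k in idx] for i in idx]
    z = [1 if i == 0 else 0 for i in idx]
    return dict(zip(idx, solve(W, z)))

large_isi = {0: Fraction(1), 1: Fraction(1, 2)}    # ASSUMED samples for Fig. 12
small_isi = {0: Fraction(1), 2: Fraction(-1, 4)}   # samples stated for Fig. 13

for N in (1, 2, 3):
    print("large ISI, N =", N, " beta_0 =", tap_gains(large_isi, N)[0])
print("small ISI, N = 6, beta_0 =", tap_gains(small_isi, 6)[0])
```

Since the SNR ratio is $\beta_0^{-1}$, the first case degrades as $1/(N+1)$ while the second stays near $1/1.15$.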

C.  SUBSTITUTION  OF  SINUSOIDS  FOR  EIGENFUNCTIONS 

The previous sections have considered two approaches to the elimination of intersymbol interference in suboptimum modulation systems. As indicated, the first of these approaches appears practical for relatively simple channels (an n = 1 Butterworth filter and white noise) while the second appears practical for "almost flat and band-limited" channels, i.e., channels for which the intersymbol interference at the optimum detector output is "relatively small." This section considers a third technique that appears practical for the range of channels between these extremes.

Recall that a study of suboptimum modulation systems involves finding "suitable substitutes" for the eigenfunctions of Theorems 1 to 3. The present section considers the possibility that suitable substitutes are time translates of a set of sinusoids. The motivation for this approach is provided in the following discussion.

1.  Asymptotic  Form  of  Eigenfunctions  and  Eigenvalues 

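This section is concerned with an investigation of the form of the $\{\varphi_i(t)\}$ and $\{\lambda_i\}$ of Theorems 1 to 3 for $T \to \infty$. Consider first the $\{\varphi_i(t)\}$ of Theorem 1 and recall that they satisfy the integral equation

$$\lambda_i\, \varphi_i(t) = \int_0^{T} \varphi_i(s)\, K(t, s)\, ds, \qquad 0 \le t \le T \tag{119}$$

where

$$K(t, s) \triangleq \int_0^{T_1} h(\sigma - t)\, h(\sigma - s)\, d\sigma$$

Since the interval over which the $\{\varphi_i(t)\}$ are defined has no effect on the form of the $\{\varphi_i(t)\}$, it is convenient to consider the following equation instead of Eq. (119)

$$\lambda_i\, \varphi_i(t) = \int_{-T/2}^{T/2} \varphi_i(s)\, K(t, s)\, ds, \qquad |t| \le T/2 \tag{120}$$

where now

$$K(t, s) \triangleq \int_{-T/2}^{T_1 - T/2} h(\sigma - t)\, h(\sigma - s)\, d\sigma = \int_{-T'/2}^{T'/2} h(\sigma - t)\, h(\sigma - s)\, d\sigma$$

with $T'/2 \triangleq T_1 - T/2$. The last equality follows from the fact that $h(t)$ is physically realizable and $T_1 > T$ is assumed. Now, let $\varphi_i(t)$ be represented by a Fourier series, i.e., let

$$\varphi_i(t) = \sum_{k=-\infty}^{\infty} a_{ik}\, e^{j\omega_k t}, \qquad \omega_k \triangleq 2\pi f_k = \frac{2\pi k}{T}, \qquad |t| \le T/2 \tag{121}$$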


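Substitution of this expression into Eq. (120) shows that

$$\lambda_i \sum_k a_{ik}\, e^{j\omega_k t} = \sum_k a_{ik} \int_{-T/2}^{T/2} e^{j\omega_k s}\, K(t,s)\, ds .$$

Multiplication of both sides by $\exp[-j\omega_\ell t]$ and integration over $[-T/2, T/2]$ gives

$$\lambda_i a_{i\ell} = \frac{1}{T} \sum_k a_{ik} \int_{-T/2}^{T/2} \int_{-T/2}^{T/2} e^{j\omega_k s}\, K(t,s)\, e^{-j\omega_\ell t}\, ds\, dt \tag{122}$$

or, from Eq. (120)

$$\lambda_i a_{i\ell} = \frac{1}{T} \sum_k a_{ik} \int_{-T'/2}^{T'/2} \left[ \int_{-T/2}^{T/2} h(\sigma - s)\, e^{j\omega_k s}\, ds \right] \left[ \int_{-T/2}^{T/2} h(\sigma - t)\, e^{-j\omega_\ell t}\, dt \right] d\sigma .$$

Since the bracketed terms are simply the output of the filter $h(t)$ when the input is of the form $\exp[j\omega t]$, $-T/2 \le t \le T/2$, and since

$$\int_{-T/2}^{T/2} e^{j\omega_k t}\, e^{-j\omega t}\, dt = \frac{\sin \pi (f - f_k) T}{\pi (f - f_k)} \triangleq T \operatorname{sinc}\,(f - f_k) T$$

it follows that

$$\lambda_i a_{i\ell} = \frac{1}{T} \sum_k a_{ik} \int_{-T'/2}^{T'/2} \left[ T \int_{-\infty}^{\infty} H(f) \operatorname{sinc}\,(f - f_k) T \; e^{j 2\pi f \sigma}\, df \right] \left[ T \int_{-\infty}^{\infty} H(\nu) \operatorname{sinc}\,(\nu + f_\ell) T \; e^{j 2\pi \nu \sigma}\, d\nu \right] d\sigma$$

or

$$\lambda_i a_{i\ell} = T T' \sum_k a_{ik} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} H(f)\, H(\nu) \operatorname{sinc}\,(\nu + f_\ell) T \, \operatorname{sinc}\,(f - f_k) T \, \operatorname{sinc}\,(f + \nu) T' \; df\, d\nu . \tag{123}$$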


Consider the integration over $f$ and observe that unless $\nu \approx -f_k$ the integral is approximately zero. Furthermore, if $T'$ is assumed to be large enough so that $H(f)$ is constant in both amplitude and phase over a band of frequencies several times $1/T'$ cps wide about $f = -\nu$, it follows that when $\nu \approx -f_k$, Eq. (123) can be written†

$$\lambda_i a_{i\ell} = T \sum_k a_{ik} \int_{-\infty}^{\infty} |H(\nu)|^2 \operatorname{sinc}\,(\nu + f_\ell) T \int_{-\infty}^{\infty} T' \operatorname{sinc}\,(f - f_k) T \, \operatorname{sinc}\,(f + \nu) T' \; df\, d\nu . \tag{124}$$


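Since $T' \ge T$,

$$\int_{-\infty}^{\infty} T' \operatorname{sinc}\,(f - f_k) T \, \operatorname{sinc}\,(f + \nu) T' \; df = \operatorname{sinc}\,(f_k + \nu) T$$

and Eq. (124) becomes

$$\lambda_i a_{i\ell} = T \sum_k a_{ik} \int_{-\infty}^{\infty} |H(\nu)|^2 \operatorname{sinc}\,(\nu + f_\ell) T \, \operatorname{sinc}\,(\nu + f_k) T \; d\nu . \tag{125}$$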
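The sinc identity used to pass from Eq. (124) to Eq. (125) can be checked numerically. The following is only a sketch: the function name `sinc_overlap`, the truncation limit `f_max`, the step `df`, and the sample values of $T$, $T'$, $f_k$, and $\nu$ are assumptions chosen for illustration. It relies on `numpy.sinc`, which uses the same normalized definition $\sin(\pi x)/(\pi x)$ as the text.

```python
import numpy as np

def sinc_overlap(T, T_prime, f_k, nu, f_max=400.0, df=1e-3):
    """Riemann-sum estimate of the integral over f of
    T' * sinc((f - f_k) T) * sinc((f + nu) T'), truncated to |f| <= f_max."""
    f = np.arange(-f_max, f_max + df, df)
    integrand = T_prime * np.sinc((f - f_k) * T) * np.sinc((f + nu) * T_prime)
    return float(np.sum(integrand) * df)

if __name__ == "__main__":
    T, T_prime, f_k = 1.0, 3.0, 2.0          # illustrative values with T' >= T
    for nu in (-2.0, -1.7, -1.0, 0.3):
        lhs = sinc_overlap(T, T_prime, f_k, nu)
        rhs = np.sinc((f_k + nu) * T)         # right-hand side of the identity
        print(f"nu = {nu:5.2f}   integral = {lhs:8.5f}   sinc((f_k + nu) T) = {rhs:8.5f}")
```

The two columns should agree to within the truncation error of the finite integration range, which for these values is well below $10^{-3}$.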

If the additional assumption is now made that $T$ is large enough so that $|H(\nu)|^2$ is essentially constant over a band of frequencies several times $1/T$ cps wide centered about $\nu = -f_\ell$, Eq. (125) can be written

$$\lambda_i a_{i\ell} = \sum_k a_{ik}\, |H(f_\ell)|^2 \int_{-\infty}^{\infty} T \operatorname{sinc}\,(\nu + f_\ell) T \, \operatorname{sinc}\,(\nu + f_k) T \; d\nu = a_{i\ell}\, |H(f_\ell)|^2$$

where the last step follows from the known orthogonality of sinc functions (Ref. 94). From this it follows that either $\lambda_i = |H(f_\ell)|^2$ or $a_{i\ell} = 0$, i.e., both $\exp[j\omega_\ell t]$ and $\exp[-j\omega_\ell t]$ (or equivalently $\sin \omega_\ell t$ and $\cos \omega_\ell t$) are eigenfunctions with eigenvalue $|H(f_\ell)|^2$. Thus, when $T$ and $T'$ are large enough to satisfy the assumed conditions, the $\{\varphi_i(t)\}$ and $\{\lambda_i\}$ of Theorem 1 are, to a good approximation, given by‡

$$\varphi_{2i-1}(t) = \sqrt{\frac{2}{T}}\, \sin \frac{2\pi i t}{T}, \qquad \varphi_{2i}(t) = \sqrt{\frac{2}{T}}\, \cos \frac{2\pi i t}{T} \tag{126}$$

and

$$\lambda_{2i} = \lambda_{2i-1} = \left| H\!\left( \frac{i}{T} \right) \right|^2 .$$

This is the desired asymptotic form for the eigenfunctions and eigenvalues of Theorem 1. Some additional insight into the value of $T_1$ required to make this result valid can be obtained by noting from Eq. (120) and Fig. 14 that if $T_1$ exceeds $T$ by at least the "duration" of $h(t)$, then, to a good approximation, and for $-T/2 \le t, s \le T/2$,


† Note that in a strict sense, Eq. (124) is true only in the limit $T' \to \infty$. However, it is clear that when $T'$ is sufficiently large, the error will be negligible.

‡ In this and subsequent discussions it is convenient to order the $\{\lambda_i\}$ in the manner of Eq. (126) rather than in the conventional manner of $\lambda_1 \ge \lambda_2 \ge \cdots$. It should be mentioned that other, closely related, asymptotic results have been obtained by Capon (Ref. 95) and Rosenblatt (Ref. 96) for the case of arbitrary $T$ and $i \to \infty$.

Fig. 14. Concerning form of $K(t,s)$ when $T_1$ is large.


$$K(t,s) = K(t - s)$$

where

$$K(t - s) = \int_{-\infty}^{\infty} |H(f)|^2 \exp[j\omega(t - s)]\, df .$$

Substitution of this result into Eq. (122) leads directly to Eq. (125).
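The asymptotic result can be illustrated numerically. The sketch below is not part of the original analysis; it assumes a single-pole channel $h(t) = e^{-t}$, $t \ge 0$, for which $K(t - s) = \tfrac{1}{2} e^{-|t - s|}$ and $|H(f)|^2 = [1 + (2\pi f)^2]^{-1}$, discretizes the integral equation on a midpoint grid, and compares the largest eigenvalues with $|H(i/T)|^2$. The function names, the grid step, and the inclusion of the $i = 0$ (constant) term in the comparison are assumptions of the sketch.

```python
import numpy as np

def kernel_eigenvalues(T, dt=0.05):
    """Eigenvalues of  lambda * phi(t) = integral_{-T/2}^{T/2} phi(s) K(t - s) ds
    for the assumed kernel K(tau) = 0.5 * exp(-|tau|), largest first."""
    t = np.arange(-T / 2 + dt / 2, T / 2, dt)            # midpoint grid on [-T/2, T/2]
    K = 0.5 * np.exp(-np.abs(t[:, None] - t[None, :]))   # symmetric kernel matrix
    lam = np.linalg.eigvalsh(K * dt)                     # quadrature (Nystrom) approximation
    return np.sort(lam)[::-1]

def asymptotic_eigenvalues(T, count):
    """|H(i/T)|^2 = 1 / (1 + (2 pi i / T)^2): once for i = 0, then twice for
    each i >= 1 (one sine and one cosine eigenfunction per frequency)."""
    idx = np.concatenate(([0], np.repeat(np.arange(1, count), 2)))[:count]
    return 1.0 / (1.0 + (2 * np.pi * idx / T) ** 2)

if __name__ == "__main__":
    T = 40.0                                             # long compared with the kernel width
    lam = kernel_eigenvalues(T)[:7]
    pred = asymptotic_eigenvalues(T, 7)
    for a, b in zip(lam, pred):
        print(f"numerical {a:.4f}   asymptotic {b:.4f}")
```

For $T = 40$ the computed eigenvalues straddle the predicted pairs to within a few percent, and the agreement is expected to tighten as $T$ grows.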
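Consider next the $\{\varphi_i(t)\}$ and $\{\lambda_i\}$ of Theorem 2 and recall that

$$\lambda_i \varphi_i(t) = \int_0^T \varphi_i(\tau)\, K(t - \tau)\, d\tau, \qquad 0 \le t \le T$$

where

$$K(t) = \int_{-\infty}^{\infty} \frac{|H(f)|^2}{N(f)}\, e^{j\omega t}\, df . \tag{127}$$

As before, it is convenient to shift the time origin so that the $\{\varphi_i(t)\}$ are defined over $[-T/2, T/2]$ and therefore satisfy the equation

$$\lambda_i \varphi_i(t) = \int_{-T/2}^{T/2} \varphi_i(s)\, K(t - s)\, ds, \qquad |t| \le T/2 . \tag{128}$$

Following the previous procedure, let $\varphi_i(t)$ be written as

$$\varphi_i(t) = \sum_{k=-\infty}^{\infty} a_{ik}\, e^{j\omega_k t}, \qquad |t| \le T/2$$

and substitute this expression into Eq. (128). This gives, after multiplication by $\exp[-j\omega_\ell t]$ and integration over $[-T/2, T/2]$,

$$\lambda_i a_{i\ell} = \frac{1}{T} \sum_k a_{ik} \int_{-T/2}^{T/2} e^{-j\omega_\ell t} \left[ \int_{-T/2}^{T/2} K(t - s)\, e^{j\omega_k s}\, ds \right] dt .$$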

Since the bracketed term is simply the output of the filter $K(t)$ when the input is $\exp[j\omega_k t]$, $-T/2 \le t \le T/2$, and since the Fourier transform of this input is $T \operatorname{sinc}\,(f - f_k) T$, it follows that

$$\lambda_i a_{i\ell} = \sum_k a_{ik} \int_{-\infty}^{\infty} \frac{|H(f)|^2}{N(f)} \operatorname{sinc}\,(f - f_k) T \; df \int_{-T/2}^{T/2} \exp[j(\omega - \omega_\ell) t]\, dt = \sum_k a_{ik} \int_{-\infty}^{\infty} \frac{|H(f)|^2}{N(f)}\, T \operatorname{sinc}\,(f - f_k) T \, \operatorname{sinc}\,(f - f_\ell) T \; df .$$

Assume now that $T$ is large enough so that $|H(f)|^2/N(f)$ is essentially constant about $f = f_\ell$. Then, with negligible error,

$$\lambda_i a_{i\ell} = \sum_k a_{ik}\, \frac{|H(f_\ell)|^2}{N(f_\ell)} \int_{-\infty}^{\infty} T \operatorname{sinc}\,(f - f_k) T \, \operatorname{sinc}\,(f - f_\ell) T \; df = a_{i\ell}\, \frac{|H(f_\ell)|^2}{N(f_\ell)} .$$

Thus, by analogy with the discussion preceding Eq. (126), it follows that when $T$ satisfies the assumed condition, the $\{\varphi_i(t)\}$ and $\{\lambda_i\}$ of Theorem 2 are, with negligible error, given by

$$\varphi_{2i-1}(t) = \sqrt{\frac{2}{T}}\, \sin \frac{2\pi i t}{T}, \qquad \varphi_{2i}(t) = \sqrt{\frac{2}{T}}\, \cos \frac{2\pi i t}{T}, \qquad \lambda_{2i} = \lambda_{2i-1} = \frac{|H(i/T)|^2}{N(i/T)} \tag{129}$$

which is the desired asymptotic form. At this point, it is sufficient to state that arguments similar to those used above show that for $T$ and $T_1$ sufficiently large, the $\{\varphi_i(t)\}$ and $\{\lambda_i\}$ of Theorem 3 are also given by Eq. (129). From this the important conclusion is reached that in all cases the $\{\varphi_i(t)\}$ become sinusoids when $T \to \infty$. Furthermore, some idea of the value of $T$ required to make this approximation valid has been obtained. This result together with the resulting simple form of the eigenvalues will prove useful in deriving a "good" suboptimum modulation system. Before proceeding with such a derivation, however, it is important to obtain some insight into the manner in which the $\varphi_i(t)$ differ from sinusoids when $T$ is finite; this difference, although quite small, has been found to be important in an experimental system.

For this discussion, let $T_1$ be infinite, consider the $\{\varphi_i(t)\}$ of Theorem 1 and recall from Sec. II-D-4 that the $\{\varphi_i(t)\}$ are self-reproducing [over the interval $(0, T)$] when passed through the cascade of $h(t)$ and $h(-t)$, i.e., when passed through a filter whose impulse response is the autocorrelation function $R_h(t)$ of the channel filter. Next, observe as suggested in Fig. 15, that when $T$ is long relative to the duration of $R_h(t)$ and a sinusoid of length $T$ is passed through $R_h(t)$, the output signal differs from a sinusoid only near the ends of the interval. Thus, since the previous work has shown that for large $T$ the $\{\varphi_i(t)\}$ are approximately sinusoids, and since sinusoids are self-reproducing except near the ends of the interval, it follows that for large but finite $T$ the $\{\varphi_i(t)\}$ differ from sinusoids only near the ends of the interval.

Fig. 15. Concerning form of the $\{\varphi_i(t)\}$ when $T$ is large.

2.  Modulation  System 


This section is concerned with the derivation of a third suboptimum modulation system based on the results of the previous section and those in Appendix B.

Consider first the analysis of Appendix B and, for a given $|H(f)|^2/N(f)$, let $T_m$ be chosen so that $|H(f)|^2/N(f)$ is essentially constant over an interval of $1/T_m$ cps. Then, from the discussion preceding Eq. (B-5), it is clear that for $T > T_m$ the sums in the error exponents for finite $T$ will differ negligibly from the limiting integral forms. Next, recall from Sec. IV-C-1 that for $T > T_m$ the $\{\varphi_i(t)\}$ are, to a good approximation, sinusoids. Thus, since $T$ might typically be on the order of days, weeks, or months and since, at least for telephone channels, $T_m$ is on the order of 0.01 to 0.1 second, it seems reasonable to attempt to substitute for the $\{\varphi_i(t)\}$ time translates of a set of sinusoids approximately $T_m$ seconds long. In other words, if a set of $\{\alpha_i(t)\}$ are defined as

$$\alpha_{2i-1}(t) \triangleq \begin{cases} \sqrt{\dfrac{2}{\mathcal{T}}}\, \sin \dfrac{2\pi i t}{\mathcal{T}} & |t| \le \mathcal{T}/2 \\[4pt] 0 & \text{elsewhere} \end{cases} \qquad \alpha_{2i}(t) \triangleq \begin{cases} \sqrt{\dfrac{2}{\mathcal{T}}}\, \cos \dfrac{2\pi i t}{\mathcal{T}} & |t| \le \mathcal{T}/2 \\[4pt] 0 & \text{elsewhere} \end{cases} \qquad \mathcal{T} \approx T_m \tag{130}$$

the $\{\varphi_i(t)\}$ would be replaced by time translates (by $k\mathcal{T}$ seconds, $k$ an integer) of some number of the $\{\alpha_i(t)\}$. However, when this is done two forms of intersymbol interference are encountered. One form occurs because the $\{\alpha_i(t)\}$ are only approximations to a set of $\{\varphi_i(t)\}$ of length $\mathcal{T}$ and therefore are only "approximately" orthogonal at the channel output. The second form arises from the nonorthogonality at the channel output of nonequal time translates of any two of the $\{\alpha_i(t)\}$. Both forms of intersymbol interference can be reduced significantly in the following manner.

Let white noise and an infinite observation interval be assumed† and recall from Sec. II-D-4 that the "coordinate filters" for this situation may be realized as the cascade of a filter with impulse response $h(-t)$ followed by a multiply and integrate operation. Also, observe that if $\mathcal{T}$ in Eq. (130) is considerably greater than the duration of the impulse response of the cascade of $h(t)$ and $h(-t)$ [if $\mathcal{T}$ is considerably greater than the duration of the filter autocorrelation function $R_h(t)$], when $\alpha_i(t)$ is transmitted the corresponding signal at the output of $h(-t)$ will be essentially an undistorted sinusoid except near the ends of the interval. Thus, if $\mathcal{T}_g$ is the approximate duration of $R_h(t)$ and if the $\{\alpha_i(t)\}$ are redefined as

$$\alpha_{2i-1}(t) = \begin{cases} \sqrt{\dfrac{2}{\mathcal{T} - \mathcal{T}_g}}\, \sin \left[ \dfrac{2\pi i t}{\mathcal{T} - \mathcal{T}_g} \right] & |t| \le \mathcal{T}/2 \\[4pt] 0 & \text{elsewhere} \end{cases} \qquad \alpha_{2i}(t) = \begin{cases} \sqrt{\dfrac{2}{\mathcal{T} - \mathcal{T}_g}}\, \cos \left[ \dfrac{2\pi i t}{\mathcal{T} - \mathcal{T}_g} \right] & |t| \le \mathcal{T}/2 \\[4pt] 0 & \text{elsewhere} \end{cases} \tag{131}$$

it follows that by observing the output of $h(-t)$ only over the "inner interval" of $|t| \le (\mathcal{T} - \mathcal{T}_g)/2$ the orthogonality of the $\{\alpha_i(t)\}$ at the channel output will be greatly improved. In addition, when time translates of any two of the $\{\alpha_i(t)\}$ are transmitted, this same technique gives a significant reduction in the intersymbol interference between successive signals. The following example provides some quantitative insight into the effectiveness of this procedure. [Note that if $R_h(t)$ were identically zero for $|t| > \mathcal{T}_g/2$, all intersymbol interference would be completely eliminated.]

† The assumption of white noise is made here only to simplify the subsequent discussion. The results obtained can be applied to the colored noise problem by simply replacing the filter with impulse response $h(-t)$ by a filter with transfer function $H^*(f)/N(f)$. The assumption of an infinite observation interval is made because it is usually quite easy to make the interval long enough to be infinite for all practical purposes and because implementation problems are simplified when this is done.
Consider a channel for which

$$h(t) = \begin{cases} e^{-t} & t \ge 0 \\ 0 & t < 0 \end{cases}$$

and let $I_{ik}$ denote the magnitude of the number obtained when the output of $h(-t)$, assuming $\alpha_i(t)$ is transmitted, is multiplied by $\alpha_k(t)$ and integrated over the interval $|t| \le (\mathcal{T} - \mathcal{T}_g)/2$. (Observe that for $i \ne k$, $I_{ik}$ provides a measure of the intersymbol interference caused by the nonorthogonality of the $\{\alpha_i(t)\}$ at the channel output.) For this channel and for $i \ne k$, $I_{ik}$ can be upper bounded by


rik 4  cr  -  y 


Thus, the intersymbol interference decreases almost exponentially with increasing $\mathcal{T}_g$ and is inversely proportional to the duration $\mathcal{T}$ of the $\{\alpha_i(t)\}$. For this channel, reasonable values of $\mathcal{T}_g$ and $\mathcal{T}$ might be 8 and 80 seconds, respectively.† For these values,

$$I_{ik} \le 5.2 \times 10^{-4}$$

and since $|H(f_k)|^2 = [1 + \omega_k^2]^{-1} \le 1$, it follows that the intersymbol interference can be considered negligible except for extremely high SNR conditions.
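The size of $I_{ik}$ in this example can be estimated by direct numerical integration. The sketch below assumes the single-pole channel $h(t) = e^{-t}$, $t \ge 0$ (consistent with $|H(f_k)|^2 = [1 + \omega_k^2]^{-1}$), so that the cascade of $h(t)$ and $h(-t)$ has impulse response $R_h(t) = \tfrac{1}{2} e^{-|t|}$. The function names, the grid step, and the choice of the first eight signals are illustrative assumptions.

```python
import numpy as np

def alpha(i, t, T, Tg):
    """alpha_i(t) per Eq. (131): odd i is a sine, even i a cosine, zero outside |t| <= T/2."""
    m = (i + 1) // 2                                    # harmonic of 1 / (T - Tg)
    arg = 2 * np.pi * m * t / (T - Tg)
    wave = np.sin(arg) if i % 2 else np.cos(arg)
    return np.where(np.abs(t) <= T / 2 + 1e-9, np.sqrt(2 / (T - Tg)) * wave, 0.0)

def trapezoid_weights(n, dt):
    w = np.full(n, dt)
    w[0] = w[-1] = dt / 2
    return w

def interference_matrix(n_sig, T=80.0, Tg=8.0, dt=0.05):
    """G[i, k]: alpha_(i+1) passed through R_h, multiplied by alpha_(k+1),
    and integrated over the inner interval |t| <= (T - Tg) / 2."""
    tau = np.linspace(-T / 2, T / 2, int(round(T / dt)) + 1)                      # transmit interval
    t = np.linspace(-(T - Tg) / 2, (T - Tg) / 2, int(round((T - Tg) / dt)) + 1)   # inner interval
    R = 0.5 * np.exp(-np.abs(t[:, None] - tau[None, :]))   # R_h(t - tau) for the assumed h(t)
    idx = range(1, n_sig + 1)
    A_tau = np.array([alpha(i, tau, T, Tg) for i in idx]) * trapezoid_weights(tau.size, dt)
    A_t = np.array([alpha(i, t, T, Tg) for i in idx]) * trapezoid_weights(t.size, dt)
    Y = A_tau @ R.T                                        # Y[i] = output of h(-t) on the inner grid
    return Y @ A_t.T

if __name__ == "__main__":
    G = interference_matrix(8)
    off_diagonal = np.abs(G - np.diag(np.diag(G)))
    print("largest I_ik for i != k:", off_diagonal.max())
    print("diagonal terms (close to |H(f_k)|^2):", np.diag(G))
```

With $\mathcal{T} = 80$ and $\mathcal{T}_g = 8$ seconds the off-diagonal terms stay below the quoted figure of $5.2 \times 10^{-4}$, while the diagonal terms are close to $|H(f_k)|^2$.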


† The selection of values for $\mathcal{T}$ and $\mathcal{T}_g$ involves a trade-off between equipment complexity, intersymbol interference reduction, and loss in effective signal power. This point is considered in more detail later for telephone channels. The values assumed here are of the same order of magnitude (after an appropriate bandwidth scale factor is introduced) as those selected for the telephone line modulation system.

As the next step in the evaluation of this modulation system, it is of interest to investigate the performance that might be expected when it is used with a telephone channel.† Yudkin has found that for several different circuits the autocorrelation function of the channel impulse response is essentially confined to an interval of 1 msec, i.e., $R_h(t) \approx 0$ for $|t| > 500\ \mu$sec. Thus, a value of $\mathcal{T}_g = 1$ msec should prove satisfactory for such a system. Given this value of $\mathcal{T}_g$, $\mathcal{T}$ should be chosen so that $|H(f)|^2$ is approximately constant over an interval of $1/\mathcal{T}$ cps and also so that $\mathcal{T} \gg \mathcal{T}_g$. However, from the standpoint of equipment complexity, $\mathcal{T}$ should be as small as possible. A reasonable compromise is $\mathcal{T} = 11$ msec. With these parameters, the $\{\alpha_i(t)\}$ are sinusoids spaced at 100-cps intervals throughout the telephone line passband. To proceed further, consider a telephone line having a nearly flat amplitude characteristic over a band of 2.5 kcps and assume a SNR of 30 db.‡ For this line, approximately 50 of the $\{\alpha_i(t)\}$ would be used. If the situation is considered in which the modulation system is used without coding, it would be practical to assign equal energy to each of the $\{\alpha_i(t)\}$ and to let each signal carry $n$ binary digits of information. The number $n$ would be made as large as possible without causing an excessive error probability due to low level noise and intersymbol interference. Assuming negligible intersymbol interference, the value of $n$ can be determined in the following manner.

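By assumption, the transmitted signal (for one interval of $\mathcal{T}$ seconds) is of the form

$$x(t) = \sum_i x_i\, \alpha_i(t), \qquad |t| \le \mathcal{T}/2 \tag{132}$$

where the $\{x_i\}$ are statistically independent and can assume the equiprobable values

$$\frac{x_i}{k} = \pm 1, \pm 3, \ldots, \pm (2^n - 1) . \tag{133}$$

The constant $k$ is chosen to satisfy the input power constraint. From Eqs. (131) and (132) $k$ must satisfy

$$\begin{aligned} S &= E\left[ \frac{1}{\mathcal{T}} \int_{-\mathcal{T}/2}^{\mathcal{T}/2} x^2(t)\, dt \right] = \frac{1}{\mathcal{T}} \sum_i \sum_j \overline{x_i x_j} \int_{-\mathcal{T}/2}^{\mathcal{T}/2} \alpha_i(t)\, \alpha_j(t)\, dt \\ &= \frac{1}{\mathcal{T}} \sum_i \overline{x_i^2} \int_{-\mathcal{T}/2}^{\mathcal{T}/2} \alpha_i^2(t)\, dt = \frac{1}{\mathcal{T}(\mathcal{T} - \mathcal{T}_g)} \sum_i \overline{x_i^2} \int_{-\mathcal{T}/2}^{\mathcal{T}/2} \left[ 1 + (-1)^i \cos 2\omega_i t \right] dt \\ &= \frac{1}{\mathcal{T} - \mathcal{T}_g} \sum_i \overline{x_i^2} \left[ 1 + (-1)^i\, \frac{\sin \omega_i \mathcal{T}}{\omega_i \mathcal{T}} \right] \approx \frac{1}{\mathcal{T} - \mathcal{T}_g} \sum_i \overline{x_i^2} \end{aligned} \tag{134}$$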


where the last step follows from the fact that the low-frequency cutoff of the telephone line requires $\omega_i > 2\pi(400)$ and therefore $\omega_i \mathcal{T} > 8.8\pi$. From Eq. (133) it follows that

$$\overline{x_i^2} = k^2\, 2^{-n+1} \sum_{l=1}^{2^{n-1}} (2l - 1)^2 \qquad \text{all } i$$

or, from Series 25 of Ref. 99,

$$\overline{x_i^2} = \tfrac{1}{3}\, k^2\, [2^{2n} - 1] \qquad \text{all } i .$$

Assuming that 50 of the $\{\alpha_i(t)\}$ are used, it follows from this and Eq. (134) that $k$ must satisfy

$$S = k^2\, \frac{5 \times 10^3}{3}\, [2^{2n} - 1] . \tag{135}$$

† Since it is possible to effectively eliminate phase crawl at a negligible cost in signal power, its effects are not considered in this discussion.

‡ In defining the SNR it is assumed that the actual input signal power is scaled to compensate for any flat loss on the line. For example, if a particular line has a minimum loss of 10 db at some midband frequency, the actual input power would be decreased by 10 db in determining the SNR. Also, the noise power is considered to be the average power of the line output signal when the input signal is removed. (For analytical purposes, it is assumed later that the channel noise is white and Gaussian with double-sided noise spectral density $N_0$ and that the noise power has been measured at the output of a 2.5-kcps ideal filter.)
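The closed form for the mean-square level can be confirmed by summing over the levels of Eq. (133) directly. This is a small illustrative check in exact rational arithmetic; the function name `mean_square_level` is an assumption of the sketch.

```python
from fractions import Fraction

def mean_square_level(n):
    """Mean of (x / k)^2 over the 2**n equiprobable levels +-1, +-3, ..., +-(2**n - 1)."""
    positive_levels = range(1, 2 ** n, 2)        # negative levels have the same squares
    return Fraction(sum(l * l for l in positive_levels), len(positive_levels))

if __name__ == "__main__":
    for n in range(1, 6):
        closed_form = Fraction(4 ** n - 1, 3)    # (2^(2n) - 1) / 3
        print(n, mean_square_level(n), closed_form)
```

For every $n$ the direct average equals $(2^{2n} - 1)/3$, which is the factor that appears in Eq. (135).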


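Next, it is necessary to determine the signal and noise components at the receiver output when the transmitted signal is given by Eq. (132). Since the $k$th "coordinate filter" for this system is realized as the cascade of the filter $h(-t)$ followed by multiplication by $\alpha_k(t)$ and integration over the interval $|t| \le (\mathcal{T} - \mathcal{T}_g)/2$, it follows that the signal portion of the $k$th filter output, say $s_k$, is

$$s_k = \int_{-(\mathcal{T} - \mathcal{T}_g)/2}^{(\mathcal{T} - \mathcal{T}_g)/2} \alpha_k(t) \left[ \int_{-\mathcal{T}/2}^{\mathcal{T}/2} x(\tau)\, R_h(t - \tau)\, d\tau \right] dt .$$

Upon substituting Eq. (132) and recalling that zero intersymbol interference is assumed, it follows that

$$s_k = |H(f_k)|^2\, x_k \tag{136}$$

where $f_k \triangleq (k + 1)/[2(\mathcal{T} - \mathcal{T}_g)]$ if $k$ is odd and $f_k \triangleq k/[2(\mathcal{T} - \mathcal{T}_g)]$ if $k$ is even. Similarly, the noise portion of the $k$th filter output, say $n_k$, is

$$n_k = \int_{-(\mathcal{T} - \mathcal{T}_g)/2}^{(\mathcal{T} - \mathcal{T}_g)/2} \alpha_k(t) \left[ \int_{-\infty}^{\infty} n(\tau)\, h(\tau - t)\, d\tau \right] dt$$

and since $\overline{n(t)\, n(\tau)} = N_0\, \delta(t - \tau)$, it follows that

$$\overline{n_k n_j} = N_0 \int_{-(\mathcal{T} - \mathcal{T}_g)/2}^{(\mathcal{T} - \mathcal{T}_g)/2} \alpha_k(t) \int_{-(\mathcal{T} - \mathcal{T}_g)/2}^{(\mathcal{T} - \mathcal{T}_g)/2} \alpha_j(\tau)\, R_h(t - \tau)\, d\tau\, dt \approx N_0\, |H(f_k)|^2\, \delta_{kj} \tag{137}$$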


where the last step follows from the assumption of zero intersymbol interference. Thus, if $|H(f_k)|^2 \approx 1$ is assumed, it follows from Eq. (133), Eqs. (135) to (137), and the assumption of Gaussian noise that the probability of error is given by

$$P_e = 1 - \frac{1}{\sqrt{2\pi}\, \sigma} \int_{-k}^{k} \exp\left[ -\tfrac{1}{2} \left( \frac{t}{\sigma} \right)^2 \right] dt = 1 - \frac{1}{\sqrt{2\pi}} \int_{-k/\sigma}^{k/\sigma} \exp\left[ -\tfrac{1}{2} t^2 \right] dt \tag{138}$$

where

$$\frac{k}{\sigma} = \left[ \frac{3 \times 10^{-3}\, S}{5\, (2^{2n} - 1)} \right]^{1/2} [N_0]^{-1/2} = \left[ \frac{S}{N_0}\, \frac{3 \times 10^{-3}}{5}\, (2^{2n} - 1)^{-1} \right]^{1/2} .$$

Finally, since a SNR of 30 db is assumed and since the SNR definition is

$$\text{SNR} = \frac{S}{5 N_0} \times 10^{-3}$$

it follows that

$$\frac{S}{5 N_0} \times 10^{-3} = 10^3$$

or

$$\frac{S}{N_0} = 5 \times 10^6 .$$

Thus, from Eq. (138),

$$\frac{k}{\sigma} = \left[ 3 \times 10^3\, (2^{2n} - 1)^{-1} \right]^{1/2}$$

and it follows from a table of the Gaussian distribution that if $P_e \le 10^{-5}$ is required, $n$ must be such that

$$\left[ 3 \times 10^3\, (2^{2n} - 1)^{-1} \right]^{1/2} \ge 4.42$$

or

$$n \le \tfrac{1}{2} \log_2 \left[ 1 + \frac{3 \times 10^3}{(4.42)^2} \right] .$$

Thus, $n \le 3$ and the conclusion is reached that it should be possible to transmit three binary digits of information on each of the $\{\alpha_i(t)\}$ without exceeding a $P_e$ of $10^{-5}$. Since the $\{\alpha_i(t)\}$ are 11 msec long and fifty of them are used, this implies a data rate of

$$R = 3 \times 50 \times (0.011)^{-1} = 13{,}600 \text{ bits/second} \tag{139}$$
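The arithmetic leading to Eq. (139) can be retraced in a few lines. The sketch below is an illustration under the stated assumptions of the text (50 tones, $\mathcal{T} = 11$ msec, $\mathcal{T}_g = 1$ msec, 30-db SNR measured in a 2.5-kcps band, $|H(f_k)|^2 \approx 1$, negligible intersymbol interference). The function names and default arguments are hypothetical, and the error probability is taken as the two-sided Gaussian tail beyond $\pm k$ with noise variance $N_0$.

```python
import math

def error_probability(n, snr_db=30.0, n_tones=50, inner=0.010, bandwidth=2500.0):
    """P(|noise| > k) for n binary digits per tone."""
    s_over_n0 = 10 ** (snr_db / 10) * 2 * bandwidth                 # S/N0, double-sided density N0
    k2_over_n0 = 3 * inner * s_over_n0 / (n_tones * (4 ** n - 1))   # (k/sigma)^2 from the power constraint
    return math.erfc(math.sqrt(k2_over_n0 / 2))

def max_bits_per_tone(pe_max=1e-5):
    """Largest n whose error probability does not exceed pe_max."""
    n = 0
    while error_probability(n + 1) <= pe_max:
        n += 1
    return n

def data_rate(n, n_tones=50, symbol_length=0.011):
    """Bits per second when each of n_tones signals carries n bits every symbol_length seconds."""
    return n * n_tones / symbol_length

if __name__ == "__main__":
    n = max_bits_per_tone()
    print("bits per tone:", n)
    print("data rate (bits/second):", round(data_rate(n)))
```

Under these assumptions the search stops at $n = 3$, and the unrounded rate is about 13,636 bits/second, which the text quotes as 13,600.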

When this result is compared with the rates of 2500 to 5000 bits/second attainable with commercial equipment (see Sec. I-C), it is clear that the present modulation system offers a significant potential improvement in performance. Furthermore, use of a "powerful" coder-decoder such as the SECO machine would lead to a 10 to 20 percent rate increase while giving virtually error-free transmission.

Fig. 16. Amplitude characteristic of simulated channel.

As a practical matter, however, it must be emphasized that this result is based on several assumed conditions that may be only approximately realized in practice. Most important of these is the assumption of negligible intersymbol interference. (Recall from Sec. I-D that intersymbol interference is the primary factor limiting the rate of present modems.) Since $R_h(t)$ is not identically zero for $|t| > 500\ \mu$sec, it is clear that some intersymbol interference must be present in an actual system; however, the form of $R_h(t)$ found by Yudkin (Ref. 34) indicates that a significant reduction will be obtained. In view of the potential rate improvement with this modulation system and because of the impracticality of analytically determining the actual intersymbol interference level as well as the effects of other departures from assumed conditions, an experimental program was undertaken to demonstrate that the theoretical rate improvement could be realized in practice. The results of this program are presented in the following section.

As a final evaluation of the telephone line modulation system, it is of interest to compare the value of $C$ for this modulation system, say $C_{\mathcal{T}}$, to the value obtained with the optimum modulation system for a channel whose amplitude characteristics are similar to those of a telephone line. Figure 16 presents the amplitude characteristic of a simulated channel used in the experimental work. Assuming white noise, the value of $C$ for this channel is presented in Fig. 17 and has been obtained by graphical integration of Eq. (38). Assuming negligible intersymbol interference, it is readily found from the previous discussion and Eq. (35) that


C 


log2 


Bj(0) 


2 


bits/second 


(140) 


where  B  (0)  is  chosen  to  satisfy 


and 


1 

By(0) 


Equation  (140)  is  plotted  in  Fig.  17  for  the  channel  of  Fig.  16  using  the  values  J  =  11  msec  and 
3 g  =  1  msec.  Observe  that  the  effective  loss  in  signal  power  is  only  3  db  at  S/2  =  3  X  10^  which 
corresponds  to  a  SNR  of  approximately  30  db. 
CHAPTER V

EXPERIMENTAL PROGRAM


The experimental program described in this section was undertaken to demonstrate that the performance predicted for the telephone line modulation (TELMOD) system in Sec. IV-C-2 could be realized in practice. As described previously, the transmitter portion of the TELMOD system consists of approximately fifty different 11-msec duration sine and cosine signals spaced at 100-cps intervals throughout the telephone line passband. The receiver portion consists of filtering by $h(-t)$ followed by multiplication by the desired sine or cosine function and integration over the inner interval of 10 msec. In other words, with $\{\alpha_i(t)\}$ and $x(t)$ defined as in Eqs. (131) and (132), respectively, this system has the form indicated in Fig. 18 for the interval $|t| \le \mathcal{T}/2$. Recall that the fundamental assumption of the previous analysis was that intersymbol interference for all signals was negligible relative to the assumed noise level of 30-db SNR. Thus, the primary goal of the experimental program was to demonstrate that this intersymbol interference level could be achieved in practice; a secondary goal, of course, was to demonstrate that other differences between the model and the real channel have a negligible effect on the predicted performance.
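The transmitter and receiver operations just described can be sketched for the idealized case of a distortionless, noiseless channel, in which the $h(-t)$ filter is omitted. Everything specific to the sketch is an assumption made for illustration: the sampling rate `FS`, the function names, the choice of 25 tone frequencies from 500 to 2900 cps (giving 50 sine and cosine signals), and the 8-level amplitudes. It is not a model of the experimental equipment.

```python
import numpy as np

T, TG = 11e-3, 1e-3      # signal length and guard time from the text, in seconds
FS = 100_000             # sampling rate in Hz (assumed for this sketch)

def tones(freqs_hz, t):
    """Sine/cosine pairs of Eq. (131); freqs_hz are multiples of 1 / (T - TG) = 100 cps."""
    amp = np.sqrt(2 / (T - TG))
    rows = []
    for f in freqs_hz:
        rows.append(amp * np.sin(2 * np.pi * f * t))
        rows.append(amp * np.cos(2 * np.pi * f * t))
    return np.array(rows)

def transmit(levels, freqs_hz):
    """x(t) of Eq. (132), sampled on |t| < T/2."""
    n = round(T * FS)
    t = (np.arange(n) - n / 2 + 0.5) / FS
    return t, levels @ tones(freqs_hz, t)

def receive(t, x, freqs_hz):
    """Multiply by each tone and integrate over the inner interval |t| < (T - TG) / 2."""
    inner = np.abs(t) < (T - TG) / 2
    return tones(freqs_hz, t[inner]) @ x[inner] / FS

if __name__ == "__main__":
    freqs = np.arange(500, 3000, 100)                    # 25 frequencies, 50 signals
    rng = np.random.default_rng(0)
    levels = rng.choice([-7, -5, -3, -1, 1, 3, 5, 7], size=2 * len(freqs)).astype(float)
    t, x = transmit(levels, freqs)
    print("largest recovery error:", np.max(np.abs(receive(t, x, freqs) - levels)))
```

Because the inner interval spans a whole number of cycles of every tone, the correlations recover the transmitted amplitudes essentially exactly on the ideal channel; on a real channel the departures from this behavior are the intersymbol interference that the experiments set out to measure.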

When considering the intersymbol interference level to be expected with the TELMOD system, two points should be noted. First, observe that if $\alpha_i(t)$ and $\alpha_k(t)$ differ in frequency by several hundred cycles per second, their spectra have virtually zero overlap. Thus, it would be expected that the orthogonality of these signals would be quite good at the channel output, i.e., the intersymbol interference at the output of a particular coordinate filter should be caused primarily by a few immediately adjacent tones. Second, recall from Sec. IV-C-1 that an eigenfunction is closely approximated by a sinusoid only if $|H(f)|$ is essentially constant over an interval several times $1/\mathcal{T}$ cps wide centered about the frequency of the sinusoid. Thus, it is to be expected that the orthogonality of immediately adjacent tones will be good (the intersymbol interference will be small) for tones in midband where $|H(f)|$ is nearly constant while the orthogonality will be poor near the band edges.

Based upon these considerations, an experiment was undertaken which involved construction of transmitting and receiving equipment using ten of the $\{\alpha_i(t)\}$ discussed above with $\mathcal{T} = 11$ msec and $\mathcal{T}_g = 1$ msec. The amplitude of each $\alpha_i(t)$ is determined by a five-digit binary number; this number is obtained from either manual switches or a random source. The frequencies of the $\{\alpha_i(t)\}$ were variable and covered the ranges of 0.5 to 0.9 kcps, 1.5 to 1.9 kcps and 2.5 to 2.9 kcps with both sine and cosine signals being used at each frequency. The block diagram for this system is shown in Fig. 19.

Using  this  equipment,  the  following  experiments  were  performed. 

A.  SIMULATED  CHANNEL  TESTS 

Fig. 18. Block diagram of TELMOD system.

Fig. 19. Block diagram of experimental equipment.

On a simulated channel with white Gaussian noise and the filter amplitude characteristic presented in Fig. 16, the zero rate error exponent was evaluated by measuring the probability of error for two orthogonal signals. The measured performance was 0.8 to 1.0 db from theoretical, which is well within the loss attributable to measurement inaccuracies and slight equipment imperfections. The intersymbol interference level for various tones was investigated by measuring the error rate due to intersymbol interference in the absence of noise. Using the approximation that the intersymbol interference was Gaussianly distributed,† it was possible to calculate an effective intersymbol interference variance for various tones in the frequency bands of 0.5 to 0.9, 1.5 to 1.9 and 2.5 to 2.9 kcps. The results of these measurements can be summarized as follows.‡

(1) In the region of 0.6 to 2.0 kcps, where $|H(f)|^2$ is flat, the intersymbol
interference variance between a particular reference tone and adjacent
tones decreased approximately as $1/n$, where $n$ is the frequency
difference between the tones in units of 100 cps. For $n > 6$, the intersymbol
interference variance caused by a single tone was too small to be
measured.

(2) In the frequency range of 0.9 to 1.9 kcps, the total intersymbol
interference variance§ was less than $6.25 \times 10^{-2}$. For 16-level output
quantization, this implies that $P_e \le 3 \times 10^{-5}$. Thus, to within the
accuracy of the Gaussian approximation to the intersymbol interference
distribution, it would be possible to transmit four binary digits of
information on each of the tones.

(3) For frequencies in the range of 0.5 to 0.9 kcps, the total intersymbol
interference variance increased from a level of less than $6.25 \times 10^{-2}$
at 0.9 kcps, to 0.2 at 0.7 kcps, to 1.0 at 0.5 kcps. The corresponding
information rates at a $P_e \le 3 \times 10^{-5}$ would be 4, 3, and 2 binary digits
per tone for the frequencies of 0.9, 0.7, and 0.5 kcps, respectively.

(4) In the 1.9- to 3.0-kcps frequency range, the intersymbol interference
variance increased from $6.25 \times 10^{-2}$ at 1.9 kcps to 0.16 at 2.5 kcps,
to 0.5 at 2.7 kcps, to 1.25 at 2.9 kcps. The corresponding information
rate at a $P_e \le 3 \times 10^{-5}$ would be 3, 2, and 1 binary digits per tone for
the frequencies of 2.5, 2.7, and 2.9 kcps, respectively.

The  significance  of  the  measurements  on  the  simulated  channel  can  be  summarized  as  follows. 

(1)  As  predicted  theoretically,  the  inter  symbol  interference  level  is  small 
for  signals  in  the  frequency  range  where  |H(f)|2  is  constant  (in  the  0.9- 
to  1.9-kcps  range)  while  it  is  significantly  large  near  either  band  edge, 
where  |H(f)p  changes  considerably  over  an  interval  of  a  few  hundred 
cycles  per  second. 


† The justification for this assumption is the fact that the intersymbol interference for a given tone is a sum of the
(small) random interferences from a number of adjacent tones. Thus, to a first approximation it would be expected
from the Central Limit Theorem that the total interference would be Gaussianly distributed. Clearly, the fact
that only 10 to 12 tones contribute significantly to the intersymbol interference for a given tone implies that this
is a highly approximate assumption. However, it is a convenient engineering approximation which led to
consistent results.

‡ All the results presented in this section are based on an extrapolation of measurements made in the indicated
frequency bands. For example, the intersymbol interference variance for a tone at 0.9 kcps was obtained by
measuring the intersymbol interference variance due to the signals in the 0.5- to 0.8-kcps band and adding this
to the variance obtained for the 0.9-kcps tone when all signals were transmitted with random amplitude.

§  The  intersymbol  interference  variances  given  in  this  section  are  expressed  in  normalized  form,  the  reference 
being  taken  as  the  amount  by  which  an  integrator  output  changes  when  the  amplitude  of  the  corresponding  input 
signal  is  changed  by  one  level.  Thus,  for  a  given  intersymbol  interference  variance  the  probability  of  error, 
assuming  32-level  quantization  of  the  integrator  output,  is 

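$$P_e = 1 - (2\pi\sigma^2)^{-1/2} \int_{-1/2}^{1/2} \exp\left[-\tfrac{1}{2}(x/\sigma)^2\right] dx .$$

Similarly, if 16-level quantization is assumed

$$P_e = 1 - (2\pi\sigma^2)^{-1/2} \int_{-1}^{1} \exp\left[-\tfrac{1}{2}(x/\sigma)^2\right] dx .$$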
(2) On the basis of midband measurements, the intersymbol interference
level for the TELMOD system on this channel is equivalent to a noise
level of approximately 31-db SNR.† Thus, for a noise SNR less than
about 30 db, the intersymbol interference level could be considered
negligible, and furthermore, the performance would be within 1 db of
theory. Conversely, for a noise SNR significantly greater than 30 db,
intersymbol interference would be the primary factor in determining
probability of error.

(3) If a noise SNR considerably greater than 30 db is assumed and if a
$P_e < 3 \times 10^{-5}$ is desired, it would be possible to achieve a data rate
of approximately 14,000 bits/second over this channel. This number
is based upon the measurements given above and assumes the following
relation between the tones and the quantization levels.


Frequency (kcps)        Binary Digits/Tone

0.5, 0.6                        2
0.7, 0.8                        3
0.9 to 1.9                      4
2.0 to 2.4                      3
2.5 to 2.7                      2
2.8 to 3.0                      1

Furthermore, if the noise SNR were sufficiently good, it would be
possible to do amplitude equalization prior to the h(-t) filter and thus
to extend the flat portion of the channel to 3.0 kcps.‡ Such equalization
would lead to a rate of 15,000 bits/second and would greatly simplify
instrumentation problems.
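
The relation between a normalized intersymbol interference variance and the number of binary digits per tone can be sketched numerically. The block below evaluates the Gaussian error-probability expression of the footnote for an arbitrary number of quantization levels. The helper `binary_digits_per_tone` rests on an assumption that is not stated in the text: the decision half-width must be at least four standard deviations (the `margin` parameter). That rule is inferred only from the fact that it matches the variance and digits-per-tone pairs quoted above; the function names are illustrative.

```python
import math

# Variances are normalised to one amplitude step of a 32-level quantiser,
# as described in the footnote on normalisation.
REFERENCE_LEVELS = 32

def error_probability(variance, levels):
    """P_e = 1 - integral of a zero-mean Gaussian density over the decision interval."""
    half_width = 0.5 * REFERENCE_LEVELS / levels   # 1/2 for 32 levels, 1 for 16 levels
    sigma = math.sqrt(variance)
    return math.erfc(half_width / (sigma * math.sqrt(2.0)))

def binary_digits_per_tone(variance, margin=4.0):
    """Largest n <= 5 whose decision half-width is at least `margin` sigma.

    The value margin=4 is an assumption inferred from the quoted figures,
    not a rule given in the report.
    """
    sigma = math.sqrt(variance)
    for n in range(5, 0, -1):
        half_width = 0.5 * REFERENCE_LEVELS / 2 ** n
        if half_width >= margin * sigma:
            return n
    return 0

# Variances quoted for the simulated channel, lower and upper band edges.
lower_edge = [binary_digits_per_tone(v) for v in (0.0625, 0.2, 1.0)]
upper_edge = [binary_digits_per_tone(v) for v in (0.16, 0.5, 1.25)]
```

Under the stated assumption, `lower_edge` and `upper_edge` give the digits-per-tone values listed in items (3) and (4) of the measurement summary.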


B.  DIAL-UP  CIRCUIT  TESTS 

Intersymbol interference measurements were made on a local dial-up line whose amplitude
characteristics are given in Fig. 20. For this channel, it was found that the severe amplitude
ripple gave an unusually long autocorrelation function $R_h(t)$ of approximately 10-msec duration.
Because of this, the observed intersymbol interference levels were large and the assumption
of a Gaussian distribution for the intersymbol interference could not be made. Thus, the only
measurements possible for this channel were probability of error vs number of amplitude levels.
The results of these measurements indicate that only two levels per tone (or equivalently about
1800 bits/second) can be transmitted at an acceptable error rate using the TELMOD system
presented in Fig. 18. However, it was found that the low-level noise SNR for this line was better
than 50 db.§ Thus, amplitude equalization of the channel before filtering by h(-t) could be done


† From Eq. (138) and the subsequent discussion, it follows that if n is the number of binary digits that can be
transmitted at a $P_e < 3 \times 10^{-5}$ due to intersymbol interference, the effective intersymbol interference SNR is
given by

$$\mathrm{SNR} \approx 4.0 \left[ \frac{2^{2n} - 1}{3} \right]^{1/2} .$$


‡ As a practical matter, it has been found during the experimental work that the SNR for many leased data circuits
is on the order of 50 db rather than the frequently quoted 30 db. Thus, such a procedure of amplitude equalization,
although not optimum from the standpoint of detection theory, would lead to significantly reduced
intersymbol interference and thus to an increased rate.


§ It must be emphasized that this SNR was observed only in the absence of impulse noise. For this line, the
impulse noise was both large and frequently occurring; the observed probability of error due only to impulse noise
was on the order of $10^{-3}$.
Fig. 20. Amplitude characteristic of local dial-up line.

to  give  a  flat  response  over  the  band  of  approximately  0.6  to  1.6kcps.  On  the  basis  of  the  tests 
on  the  simulated  channel,  this  would  give  a  capability  of  transmitting  at  least  eight  levels  on 
each  tone  and  thus  of  realizing  a  data  rate  of  approximately  6000  bits/second. 

C.  SCHEDULE  4  DATA  CIRCUIT  TESTS 

The third series of tests was made on a Schedule 4 Data Circuit looped from Lincoln Laboratory
via Springfield, Massachusetts, whose amplitude characteristics are given in Fig. 21. As
with the simulated channel, the intersymbol interference level for various tones was investigated
by measuring probability of error due to intersymbol interference and then calculating an
effective intersymbol interference variance assuming a Gaussian distribution.† The results of
these measurements can be summarized as follows.

(1) As before, the intersymbol interference between a reference tone
and adjacent tones was found to decrease approximately as 1/n where
the frequency difference between tones is n × 100 cps. For n > 6, the
intersymbol interference due to a single tone was too small to be
measured.

(2) In the frequency range of 0.5 to 0.9 kcps, the intersymbol interference
variance was 0.74 at 0.5 kcps and 1.56 at 0.7 and 0.9 kcps.‡ The
corresponding data rate at a $P_e < 3 \times 10^{-5}$ is 2 binary digits per tone at
0.5 kcps and 1 binary digit per tone at 0.7 and 0.9 kcps.

(3) For the frequency range 1.5 to 1.9 kcps, the measured variances were
0.12 at 1.5 kcps, 0.15 at 1.7 kcps and 0.21 at 1.9 kcps. The corresponding
data rate would be 3 binary digits per tone.


† For this line, the low-level noise SNR was 50 db and impulse noise activity was extremely small; the measured
probability  of  error  due  only  to  impulse  noise  was  on  the  order  of  10“^.  Thus,  the  effects  of  both  low-level  noise 
and  impulse  noise  were  neglected  in  the  intersymbol  interference  measurements. 

‡ Observe here that the intersymbol interference increases for tones further away from the band edge. The cause
of this is the rapid change in $|H(f)|^2$ around 1.0 kcps.
Fig. 21. Amplitude characteristic of Schedule 4 Data Circuit.

(4) In the 2.5- to 2.9-kcps frequency range, the variances increased from
1.32 at 2.5 kcps to 2.1 at 2.7 kcps, to 3.3 at 2.9 kcps. The corresponding
data rate at a $P_e < 3 \times 10^{-5}$ is 1 binary digit per tone.

The  significance  of  these  measurements  can  be  summarized  as  follows. 


(1) As predicted theoretically, the fact that this channel is not flat causes
the intersymbol interference level for the TELMOD system to be
significantly higher than it was on the simulated channel. However, the
high observed SNR for low-level noise would allow amplitude
equalization of the channel prior to filtering by h(-t) and thus would allow
a significant reduction in the intersymbol interference level.

(2) If a $P_e < 3 \times 10^{-5}$ is desired, it would be possible without amplitude
equalization of the line to achieve a data rate of approximately 8400
bits/second. This is based upon the measurements given above and assumes
the following relation between the tones and the quantization levels.


Frequency (kcps)        Binary Digits/Tone

0.5 to 1.0                      1
1.1 to 1.3                      2
1.4 to 1.9                      3
2.0 to 2.4                      2
2.5 to 3.0                      1


If, however, amplitude equalization of the line were performed, the tests
on the simulated channel indicate that a rate of approximately 15,000
bits/second would be possible.
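
The arithmetic behind the tone allocation for the Schedule 4 circuit can be tallied as follows. This is a sketch only. It assumes tones spaced 100 cps apart across each listed band, an assumption taken from the 100-cps tone separation mentioned in the measurements, and it derives the implied number of signaling intervals per second from the quoted 8400 bits/second figure rather than from any stated system parameter.

```python
# Band edges are in units of 100 cps (so 5 means 0.5 kcps); the third entry
# is the number of binary digits carried by each tone in that band.
SCHEDULE_4_ALLOCATION = [
    (5, 10, 1),
    (11, 13, 2),
    (14, 19, 3),
    (20, 24, 2),
    (25, 30, 1),
]

def tones_in_band(low, high):
    """Number of tones from low to high inclusive, assuming 100-cps spacing."""
    return high - low + 1

def digits_per_interval(allocation):
    """Total binary digits carried by all tones in one signaling interval."""
    return sum(tones_in_band(lo, hi) * digits for lo, hi, digits in allocation)

total_digits = digits_per_interval(SCHEDULE_4_ALLOCATION)

# Signaling intervals per second implied by the quoted data rate
# (a derived figure, not one given in the text).
implied_intervals_per_second = 8400 / total_digits
```

The same tally applied to any other allocation gives a quick way to compare rates, since the rate scales with the total number of digits per interval.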
APPENDIX A

PROOF THAT THE KERNELS OF THEOREMS 1 TO 3 ARE $\mathcal{L}_2$

A kernel K(t, s) is said to be an $\mathcal{L}_2$-kernel if

$$\|K\|^2 \triangleq \int_0^T \int_0^T K^2(t, s)\, dt\, ds < \infty .$$


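For the kernel of Theorem 1, it follows that

$$\|K\|^2 = \int_0^T \int_0^T \left[ \int_0^{T_1} h(\sigma - t)\, h(\sigma - s)\, d\sigma \right]^2 dt\, ds$$

and thus from the Schwarz inequality

$$\|K\|^2 \le \int_0^T \int_0^T \left[ \int_0^{T_1} h^2(\sigma - t)\, d\sigma \right] \left[ \int_0^{T_1} h^2(\sigma - s)\, d\sigma \right] dt\, ds$$

$$= \left[ \int_0^T \int_0^{T_1} h^2(\sigma - t)\, d\sigma\, dt \right]^2
\le \left[ \int_0^T \int_{-\infty}^{\infty} h^2(\sigma)\, d\sigma\, dt \right]^2
= \left[ T \int_{-\infty}^{\infty} h^2(t)\, dt \right]^2 < \infty$$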


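which proves that the kernel is $\mathcal{L}_2$. Next, to prove that the kernel of Theorem 2 is $\mathcal{L}_2$, define a
function $K_{1/2}(t)$ by

$$K_{1/2}(t) \triangleq \int_{-\infty}^{\infty} |H(f)|\, [N(f)]^{-1/2}\, e^{j\omega t}\, df .$$

Then

$$K(t - s) = \int_{-\infty}^{\infty} K_{1/2}(t - \sigma)\, K_{1/2}(s - \sigma)\, d\sigma$$

and, therefore

$$\|K\|^2 = \int_0^T \int_0^T \left[ \int_{-\infty}^{\infty} K_{1/2}(t - \sigma)\, K_{1/2}(s - \sigma)\, d\sigma \right]^2 dt\, ds$$

which, from the Schwarz inequality, becomes

$$\|K\|^2 \le \int_0^T \int_0^T \left[ \int_{-\infty}^{\infty} K_{1/2}^2(t - \sigma)\, d\sigma \right] \left[ \int_{-\infty}^{\infty} K_{1/2}^2(s - \sigma)\, d\sigma \right] dt\, ds
= \left[ T \int_{-\infty}^{\infty} K_{1/2}^2(\sigma)\, d\sigma \right]^2 .$$

Thus, from Parseval's theorem,

$$\|K\|^2 \le \left[ T \int_{-\infty}^{\infty} |H(f)|^2\, [N(f)]^{-1}\, df \right]^2 < \infty$$

and the kernel is $\mathcal{L}_2$. Finally, to prove that the kernel of Theorem 3 is $\mathcal{L}_2$, define a function
$k_{1/2}(t, s)$ by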

yi^T 


el/2(t,  s)  k  l  - jzr1  yi 

i=l  'sri 


(S) 


where 


T 

(h  y  )  4  f  1  y.(<t)  h(<T  —  t)  dcr 

1  ^O 


and  y.(t)  and  /T  are  defined  as  in  the  proof  of  Theorem  3.  Then 


T 

J  1  ki/2^’  kl/2^S’  d<T  =  E 


(ht’  Yi)T1  (hs’  Yj)T1 

M 


(y,*>  yJm 
1  J  Ai 


(hfj  y.)T  (h^,  yi^T 

^  1 


.  p  fT* 

Jo  Jo 


jO 

=  K(t,  S) 


v  ri<or)  r^p) 

E  T, 


i=l 


h(p  —  s)  da  dp 


Thus, 


l|K|| 


:= r  r  if* 

^o  ^o  L^o 


kl/2(t'  kl/2(S'  ^  dtr 


dt  ds 


which becomes, from the Schwarz inequality,
IlK 


l|2«I0T£T  [£Tlkl2/2(,'",d"]  [£T‘k12/2<S'"ldt' 

ii  .2  1 2 

«  (ht.y.)T  1 

I  2  ‘d*  • 

JO  i=l  1  j 


dt  ds 
However, it is known that

(ht,  Yi)2 


i=i  1  J-a 


iH(f)r 

N(f) 


df 
Thus


\K\\Z< 


r 

-<x 


I  H(f)  | 


N(f) 


<  °o 


and the kernel is $\mathcal{L}_2$.
APPENDIX B

DERIVATION  OF  ASYMPTOTIC  FORM  OF  ERROR  EXPONENTS 


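This Appendix is concerned with a derivation of the limiting form of the error exponents of
Eqs. (34) to (36) and Eq. (58) for $T \to \infty$. Consider first the standard random coding exponent
given by

$$E_T(\rho) = \left( \frac{\rho}{1 + \rho} \right)^2 \frac{S}{2}\, B_T(\rho), \qquad R(1) \le R \le R(0)$$

$$R(\rho) = \frac{1}{2T} \sum_{i=1}^{N} \ln \frac{\lambda_i}{B_T(\rho)} - \frac{E_T(\rho)}{\rho}, \qquad 0 \le \rho \le 1$$

and

$$E_T(R) = \frac{1}{2T} \sum_{i=1}^{N} \ln \frac{\lambda_i}{B_T(1)} - R, \qquad 0 \le R \le R(1)$$

where

$$B_T(\rho) = \frac{N}{\dfrac{ST}{1 + \rho} + \sum_{i=1}^{N} \lambda_i^{-1}} \tag{B-1}$$

and N is chosen to satisfy $\lambda_N \ge B_T(\rho) \ge \lambda_{N+1}$. Recall from Sec. IV-C-1 that for large T the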

$\{\lambda_i\}$ are given by

$$\lambda_{2i} = \lambda_{2i-1} = \frac{|H(i/T)|^2}{N(i/T)}, \qquad i = 1, 2, 3, \ldots \tag{B-2}$$

and assume temporarily that $|H(f)|^2/N(f)$ is a continuous monotone nonincreasing function for
$f \ge 0$. Also, recall from Fig. 5(a), or observe from Eq. (B-1), that $B_T(\rho)$ is chosen to satisfy

$$\frac{S}{2(1 + \rho)} = \frac{1}{T} \sum_{i=1}^{N/2} \left[ \frac{1}{B_T(\rho)} - \frac{N(i/T)}{|H(i/T)|^2} \right] \tag{B-3}$$

where now N/2 is the largest integer such that

$$\frac{|H(N/2T)|^2}{N(N/2T)} \ge B_T(\rho) .$$

Observe, as suggested in Fig. B-1, that the quantity

$$\frac{1}{T} \sum_{i=1}^{N/2} \frac{N(i/T)}{|H(i/T)|^2} \tag{B-4}$$

is an approximation to the area under the curve $N(f)/|H(f)|^2$, while the quantity
Fig. B-1. Concerning interpretation of Eq. (B-3).


$$\frac{N}{2T} \cdot \frac{1}{B_T(\rho)}$$

is precisely the area under the line $[B_T(\rho)]^{-1}$. Viewed in this manner, it is clear that for
arbitrarily large T, $B_T(\rho)$ will always be adjusted to make the difference in these two areas equal
to $S/2(1 + \rho)$. Thus, since $N(f)/|H(f)|^2$ is continuous and therefore Riemann integrable over a
finite interval, and since it is readily shown from the definition of a Riemann integral$^{69,97}$ that
when $N = 2TW$ ($W > 0$)

$$\lim_{T \to \infty} \frac{1}{T} \sum_{i=1}^{N/2} \frac{N(i/T)}{|H(i/T)|^2} = \int_0^W \frac{N(f)}{|H(f)|^2}\, df \tag{B-5}$$

it follows that in the limit $T \to \infty$, $B_T(\rho)$ or, more simply, $B(\rho)$ must be chosen to satisfy

$$\frac{S}{2(1 + \rho)} = \int_0^W \left[ \frac{1}{B(\rho)} - \frac{N(f)}{|H(f)|^2} \right] df$$

where W is defined by

$$\frac{|H(W)|^2}{N(W)} = B(\rho) . \tag{B-6}$$

From this discussion and Eq. (B-1) it follows that

$$E(\rho) \triangleq \lim_{T \to \infty} E_T(\rho) = \left( \frac{\rho}{1 + \rho} \right)^2 \frac{S}{2}\, B(\rho), \qquad 0 \le \rho \le 1 \tag{B-7}$$


where $B(\rho)$ is given by Eq. (B-6). Similarly, from Eqs. (B-1), (B-2), and (B-5) it follows that

$$\lim_{T \to \infty} R(\rho) = \lim_{T \to \infty} \left[ \frac{1}{T} \sum_{i=1}^{N/2} \ln \frac{|H(i/T)|^2}{N(i/T)} - \frac{N}{2T} \ln B_T(\rho) - \frac{E_T(\rho)}{\rho} \right], \qquad 0 \le \rho \le 1 \tag{B-8}$$

From Fig. B-1 it is clear that for $T \to \infty$, $N/2T \to W$ where W is defined by Eq. (B-6).
Furthermore, since $\ln(\cdot)$ is a continuous function of its argument, it follows from above that

$$\lim_{T \to \infty} \frac{1}{T} \sum_{i=1}^{N/2} \ln \frac{|H(i/T)|^2}{N(i/T)} = \int_0^W \ln \frac{|H(f)|^2}{N(f)}\, df .$$

Thus, from Eq. (B-8),

$$\lim_{T \to \infty} R(\rho) = \int_0^W \ln \frac{|H(f)|^2}{N(f)}\, df - W \ln B(\rho) - \frac{E(\rho)}{\rho}
= \int_0^W \ln \frac{|H(f)|^2}{N(f)\, B(\rho)}\, df - \frac{E(\rho)}{\rho}, \qquad 0 \le \rho \le 1 \tag{B-9}$$


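Equations (B-7) and (B-9) are the desired asymptotic forms when $|H(f)|^2/N(f)$ is monotonic.
When $|H(f)|^2/N(f)$ is nonmonotonic an analogous derivation shows that for $T \to \infty$, $B(\rho)$ is chosen
so that

$$\frac{S}{2(1 + \rho)} = \int_W \left[ \frac{1}{B(\rho)} - \frac{N(f)}{|H(f)|^2} \right] df$$

where

$$W = \left\{ f : f \ge 0,\ \frac{|H(f)|^2}{N(f)} \ge B(\rho) \right\} . \tag{B-10}$$

The corresponding values of $E(\rho)$ and $R(\rho)$ are given by

$$E(\rho) = \left( \frac{\rho}{1 + \rho} \right)^2 \frac{S}{2}\, B(\rho)$$

and

$$R(\rho) = \int_W \ln \frac{|H(f)|^2}{N(f)\, B(\rho)}\, df - \frac{E(\rho)}{\rho}$$

which is the result presented in Eqs. (37) and (38).

When $0 \le R \le R(1)$, it follows from Eq. (B-1) and the previous discussion that

$$\lim_{T \to \infty} E_T(R) = \lim_{T \to \infty} \frac{1}{T} \sum_{i=1}^{N/2} \ln \frac{|H(i/T)|^2}{N(i/T)\, B_T(1)} - R
= \int_W \ln \frac{|H(f)|^2}{N(f)\, B(1)}\, df - R$$

where B(1) and W are defined by Eq. (B-10). This is the result presented in Eq. (39).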

Finally, the similarity of the expurgated bound given in Eq. (58) to the standard bound given
in Eq. (B-1) together with the above results leads directly to the asymptotic form of the
expurgated bound presented in Eq. (59).
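
The limiting relations of Eqs. (B-6), (B-7), and (B-9) can be explored numerically for the monotone case. The sketch below is illustrative only. The ratio $|H(f)|^2/N(f) = 1/(1 + f^2)$, the values of S and $\rho$, and all function names are assumptions made for the example, and bisection with a midpoint-rule integral is simply a convenient way to solve for $B(\rho)$ and W. For this particular ratio the level equation reduces to $S/2(1+\rho) = 2W^3/3$, which gives a closed-form check.

```python
import math

def ratio(f):
    """Hypothetical |H(f)|^2 / N(f): continuous and nonincreasing for f >= 0."""
    return 1.0 / (1.0 + f * f)

def integrate(func, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def band_edge(level, f_max=100.0):
    """Solve ratio(W) = level for W by bisection (ratio is decreasing)."""
    lo, hi = 0.0, f_max
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if ratio(mid) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def solve_level(S, rho):
    """Find B(rho) and W such that S / 2(1 + rho) equals the area between
    the line 1/B and the curve N(f)/|H(f)|^2 over [0, W]."""
    target = S / (2.0 * (1.0 + rho))

    def area(B):
        return integrate(lambda f: 1.0 / B - 1.0 / ratio(f), 0.0, band_edge(B))

    lo, hi = 1e-6, ratio(0.0)      # the area shrinks as B grows
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if area(mid) > target:
            lo = mid
        else:
            hi = mid
    B = 0.5 * (lo + hi)
    return B, band_edge(B)

def exponents(S, rho):
    """Return B(rho), W, E(rho) and R(rho) for the hypothetical channel."""
    B, W = solve_level(S, rho)
    E = (rho / (1.0 + rho)) ** 2 * (S / 2.0) * B
    R = integrate(lambda f: math.log(ratio(f) / B), 0.0, W) - E / rho
    return B, W, E, R

B, W, E, R = exponents(8.0 / 3.0, 1.0)
```

With S = 8/3 and $\rho = 1$ the closed form gives W = 1 and B = 1/2, so the numerical result can be compared directly against it.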
APPENDIX C

DERIVATION  OF  EQUATION  (63) 


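This Appendix proves that the function y(t) which maximizes the functional

$$I = \int_0^{\mathcal{T}} \left\{ -\lambda x^2(\sigma) + \int_0^{\mathcal{T}} x(\sigma)\, x(\rho) \left[ R_h(\sigma - \rho) + \sum_{k=1}^{N} 2\beta_k R_h(\sigma - \rho - k\mathcal{T}) \right] d\rho \right\} d\sigma \tag{C-1}$$

is a solution of the integral equation

$$\lambda y(t) = \int_0^{\mathcal{T}} y(\tau)\, K(t - \tau)\, d\tau, \qquad 0 \le t \le \mathcal{T} \tag{C-2}$$

where

$$K(t - \tau) \triangleq R_h(t - \tau) + \sum_{k=1}^{N} \beta_k \left[ R_h(t - \tau + k\mathcal{T}) + R_h(t - \tau - k\mathcal{T}) \right] .$$

Following standard variational techniques, let $x(t) = y(t) + \epsilon f(t)$, where y(t) is the desired
solution, and set

$$\left. \frac{dI}{d\epsilon} \right|_{\epsilon = 0} = 0 .$$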

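From Eq. (C-1), this gives

$$\left. \frac{dI}{d\epsilon} \right|_{\epsilon = 0} = \int_0^{\mathcal{T}} \left\{ -2\lambda y(\sigma) f(\sigma) + \int_0^{\mathcal{T}} \left[ y(\sigma) f(\rho) + f(\sigma) y(\rho) \right] \left[ R_h(\sigma - \rho) + \sum_{k=1}^{N} 2\beta_k R_h(\sigma - \rho - k\mathcal{T}) \right] d\rho \right\} d\sigma \tag{C-3}$$

However, since $R_h(\sigma - \rho) = R_h(\rho - \sigma)$, it follows that

$$\int_0^{\mathcal{T}} \int_0^{\mathcal{T}} y(\sigma) f(\rho) R_h(\sigma - \rho)\, d\sigma\, d\rho = \int_0^{\mathcal{T}} \int_0^{\mathcal{T}} f(\sigma) y(\rho) R_h(\sigma - \rho)\, d\sigma\, d\rho$$

and

$$\int_0^{\mathcal{T}} \int_0^{\mathcal{T}} y(\sigma) f(\rho) \sum_{k=1}^{N} \beta_k R_h(\sigma - \rho - k\mathcal{T})\, d\sigma\, d\rho = \int_0^{\mathcal{T}} \int_0^{\mathcal{T}} y(\rho) f(\sigma) \sum_{k=1}^{N} \beta_k R_h(\sigma - \rho + k\mathcal{T})\, d\sigma\, d\rho .$$

Thus, Eq. (C-3) can be written as

$$\left. \frac{dI}{d\epsilon} \right|_{\epsilon = 0} = 2 \int_0^{\mathcal{T}} \left[ -\lambda y(\sigma) + \int_0^{\mathcal{T}} y(\rho)\, K(\sigma - \rho)\, d\rho \right] f(\sigma)\, d\sigma \tag{C-4}$$

where $K(\sigma - \rho)$ is defined in Eq. (C-2). Finally, setting

$$\left. \frac{dI}{d\epsilon} \right|_{\epsilon = 0} = 0$$

for all f(t) leads to the result that the bracketed quantity in the integrand of Eq. (C-4) must be
zero for all $0 \le \sigma \le \mathcal{T}$, i.e., y(t) must be a solution of Eq. (C-2).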
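
A discretized version of this argument can be checked numerically. The sketch below is an illustration under stated assumptions, not the procedure used in the report: the autocorrelation $R_h(t) = e^{-|t|}$, the multipliers in `BETAS`, the grid size, and the use of power iteration are all hypothetical choices. It checks two points: the unsymmetric kernel appearing in the functional and the symmetric kernel K of Eq. (C-2) give the same quadratic form, and the eigenfunction of K with the largest eigenvalue gives a value of the quadratic form that no unit-energy trial signal exceeds.

```python
import math
import random

PERIOD = 1.0            # signal duration (script T in the text); illustrative value
BETAS = [0.2, 0.1]      # hypothetical multipliers beta_1, beta_2
N_GRID = 40
DT = PERIOD / N_GRID
GRID = [(i + 0.5) * DT for i in range(N_GRID)]

def r_h(t):
    """Hypothetical even autocorrelation function R_h(t)."""
    return math.exp(-abs(t))

def g_kernel(u):
    """Kernel inside the functional: R_h(u) + sum_k 2 beta_k R_h(u - k T)."""
    return r_h(u) + sum(2.0 * b * r_h(u - k * PERIOD) for k, b in enumerate(BETAS, 1))

def k_kernel(u):
    """Symmetric kernel: R_h(u) + sum_k beta_k [R_h(u + k T) + R_h(u - k T)]."""
    return r_h(u) + sum(b * (r_h(u + k * PERIOD) + r_h(u - k * PERIOD))
                        for k, b in enumerate(BETAS, 1))

def matrix(kernel):
    """Sample kernel(s - r) on the grid."""
    return [[kernel(s - r) for r in GRID] for s in GRID]

def quad(x, m):
    """Riemann-sum version of the double integral of x(s) x(r) m(s, r)."""
    return sum(x[i] * m[i][j] * x[j]
               for i in range(N_GRID) for j in range(N_GRID)) * DT * DT

def normalise(x):
    """Scale x to unit energy on the grid."""
    energy = sum(v * v for v in x) * DT
    return [v / math.sqrt(energy) for v in x]

def top_eigenfunction(m, iters=300):
    """Power iteration on the discretised integral operator."""
    y = normalise([1.0] * N_GRID)
    for _ in range(iters):
        y = normalise([sum(m[i][j] * y[j] for j in range(N_GRID)) * DT
                       for i in range(N_GRID)])
    return quad(y, m), y

G, K = matrix(g_kernel), matrix(k_kernel)
lam, y = top_eigenfunction(K)

rng = random.Random(0)
trials = [normalise([rng.uniform(-1.0, 1.0) for _ in GRID]) for _ in range(20)]
```

Power iteration is adequate here because every sampled kernel value is positive, so the largest eigenvalue is simple and dominant; a different kernel might call for a general symmetric eigensolver.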
APPENDIX D

DERIVATION  OF  EQUATION  (70) 


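This Appendix proves that if $Q_\sigma(p)$ is a polynomial differential operator of order n, if v(t)
has a continuous (n - 1)st derivative and satisfies the boundary conditions of Eq. (68), and if K(t)
is n times differentiable, then

$$\int_0^{\mathcal{T}} [Q_\sigma(p)\, v(\sigma)]\, K(t - \sigma)\, d\sigma = \int_0^{\mathcal{T}} v(\sigma)\, [Q_\sigma(-p)\, K(t - \sigma)]\, d\sigma . \tag{D-1}$$

The definition of $Q_\sigma(p)$ implies that

$$Q_\sigma(p)\, v(\sigma) = \sum_{i=0}^{n} b_i\, v^{(i)}(\sigma), \qquad 0 \le \sigma \le \mathcal{T}$$

where

$$v^{(i)}(\sigma) \triangleq \frac{d^i v}{d\sigma^i}$$

and

$$Q(s) \triangleq \sum_{i=0}^{n} b_i s^i .$$

Thus

$$\int_0^{\mathcal{T}} [Q_\sigma(p)\, v(\sigma)]\, K(t - \sigma)\, d\sigma = \sum_{i=0}^{n} b_i \int_0^{\mathcal{T}} v^{(i)}(\sigma)\, K(t - \sigma)\, d\sigma .$$

For i > 0, integration by parts and substitution of the boundary conditions of Eq. (68) shows that

$$\int_0^{\mathcal{T}} v^{(i)}(\sigma)\, K(t - \sigma)\, d\sigma = K(t - \sigma)\, v^{(i-1)}(\sigma) \Big|_{\sigma = 0}^{\sigma = \mathcal{T}} - \int_0^{\mathcal{T}} v^{(i-1)}(\sigma)\, \frac{d}{d\sigma} K(t - \sigma)\, d\sigma
= -\int_0^{\mathcal{T}} v^{(i-1)}(\sigma)\, \frac{d}{d\sigma} K(t - \sigma)\, d\sigma .$$

Repeated application of this result leads to

$$\int_0^{\mathcal{T}} v^{(i)}(\sigma)\, K(t - \sigma)\, d\sigma = (-1)^i \int_0^{\mathcal{T}} v(\sigma)\, \frac{d^i}{d\sigma^i} K(t - \sigma)\, d\sigma .$$

Thus,

$$\sum_{i=0}^{n} b_i \int_0^{\mathcal{T}} v^{(i)}(\sigma)\, K(t - \sigma)\, d\sigma = \int_0^{\mathcal{T}} v(\sigma) \left[ \sum_{i=0}^{n} b_i (-1)^i \frac{d^i}{d\sigma^i} K(t - \sigma) \right] d\sigma
= \int_0^{\mathcal{T}} v(\sigma)\, [Q_\sigma(-p)\, K(t - \sigma)]\, d\sigma$$

and Eq. (D-1) is proved.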
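
The identity of Eq. (D-1) is easy to spot-check numerically for a second-order operator. In the sketch below every specific is an assumption chosen for illustration: the interval [0, 1], the coefficients in `B_COEFFS`, the function $v(\sigma) = \sin^2(\pi\sigma)$ (which, together with its first derivative, vanishes at both end points and so stands in for the boundary conditions of Eq. (68)), and the kernel $K(u) = e^{-u^2}$.

```python
import math

T_END = 1.0                     # integration interval [0, T_END]; illustrative
B_COEFFS = [1.0, 2.0, 3.0]      # hypothetical coefficients b_0, b_1, b_2 of Q(s)

def v_derivs(s):
    """v(s) = sin^2(pi s) and its first two derivatives."""
    return [math.sin(math.pi * s) ** 2,
            math.pi * math.sin(2.0 * math.pi * s),
            2.0 * math.pi ** 2 * math.cos(2.0 * math.pi * s)]

def k_derivs(u):
    """K(u) = exp(-u^2) and its first two derivatives with respect to u."""
    e = math.exp(-u * u)
    return [e, -2.0 * u * e, (4.0 * u * u - 2.0) * e]

def integrate(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def lhs(t):
    """Integral of [Q(p) v(s)] K(t - s) over the interval."""
    return integrate(
        lambda s: sum(b * d for b, d in zip(B_COEFFS, v_derivs(s))) * k_derivs(t - s)[0],
        0.0, T_END)

def rhs(t):
    """Integral of v(s) [Q(-p) K(t - s)] over the interval.

    Because d/ds K(t - s) = -K'(t - s), the factor (-1)^i d^i/ds^i applied to
    K(t - s) equals the i-th derivative of K evaluated at t - s.
    """
    return integrate(
        lambda s: v_derivs(s)[0] * sum(b * d for b, d in zip(B_COEFFS, k_derivs(t - s))),
        0.0, T_END)

differences = [abs(lhs(t) - rhs(t)) for t in (-0.5, 0.3, 1.2)]
```

If v did not satisfy the end-point conditions, the boundary terms from integration by parts would survive and the two sides would no longer agree.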
APPENDIX E

PROOF  OF  EVEN-ODD  PROPERTY  OF  NONDEGENERATE  EIGENFUNCTIONS 


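It is desired to prove that all nondegenerate eigenfunctions of the eigenvalue differential
equation

$$[N(p^2) - \lambda_i D(p^2)]\, v_i(t) = 0, \qquad -\frac{\mathcal{T}}{2} < t < \frac{\mathcal{T}}{2} \tag{E-1}$$

$$v_i\!\left( \pm\frac{\mathcal{T}}{2} \right) = \cdots = v_i^{(n-1)}\!\left( \pm\frac{\mathcal{T}}{2} \right) = 0 \tag{E-2}$$

are either even or odd functions. Observe first that since $N(p^2)$ and $D(p^2)$ contain only even
order derivatives and since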


j2k  _,2k 

- Tk  vi(t)  =  ~ "2k  vi(_t) 
dt^K  1  dt:  1 


k  =  0,  1,  . .  . 


(E-3) 


it follows that if $v_i(t)$ is an eigenfunction with eigenvalue $\lambda_i$ then $v_i(-t)$ is also an eigenfunction
with eigenvalue $\lambda_i$. Thus, if $v_i(t)$ is a nondegenerate eigenfunction, there must exist a constant
b such that

$$v_i(t) - b\, v_i(-t) = 0 . \tag{E-4}$$

However, if $v_{ie}(t)$ and $v_{io}(t)$ denote the even and odd parts of $v_i(t)$, respectively, that is,

$$v_{ie}(t) = \frac{v_i(t) + v_i(-t)}{2}$$

and

$$v_{io}(t) = \frac{v_i(t) - v_i(-t)}{2}$$

it follows from Eq. (E-4) that $(1 - b)\, v_{ie}(t) = -(1 + b)\, v_{io}(t)$. Since the left side is an even function
and the right side is an odd function, both must vanish identically, and therefore either b = 1 and
$v_{io}(t) = 0$ or b = -1 and $v_{ie}(t) = 0$. In other words, a nondegenerate eigenfunction is either even
or odd.
APPENDIX F

OPTIMUM  TIME -LIMITED  SIGNALS  FOR  COLORED  NOISE 


This  Appendix  presents  a  generalization  of  the  results  of  Sec.  IV- A  to  channels  with  colored 
noise.  Since  a  derivation  of  the  results  for  colored  noise  follows  the  previous  work  quite  closely, 
only  the  final  results  are  presented  here. 

The problem considered is that of choosing a fixed energy, time-limited (to $[0, \mathcal{T}]$) signal to
give at the optimum detector$^{54}$ output both zero intersymbol interference at t = kT, k = ±1,
±2, ..., and a maximum SNR at t = 0. Under the restrictions that h(t) is a lumped parameter
system and N(f) is a rational spectrum, the following result is obtained. As before, let

$$H(s)\, H(-s) \triangleq \frac{N(s^2)}{D(s^2)} ,$$

let

$$N(f) \triangleq \left. \frac{\bar{N}(s^2)}{\bar{D}(s^2)} \right|_{s^2 = -4\pi^2 f^2}$$

and assume that the orders of the polynomials $N(s^2)$, $D(s^2)$, $\bar{N}(s^2)$, and $\bar{D}(s^2)$ are 2m, 2n, $2\bar{m}$,
$2\bar{n}$, respectively. Then the maximization problem previously outlined leads to the boundary
value differential equation given by

$$[N(p^2)\, \bar{D}(p^2) - \lambda_i D(p^2)\, \bar{N}(p^2)]\, v_i(t) = 0, \qquad 0 < t < \mathcal{T}$$

$$v_i(0) = v_i(\mathcal{T}) = 0$$

$$\vdots$$

$$v_i^{(n + \bar{m} - 1)}(0) = v_i^{(n + \bar{m} - 1)}(\mathcal{T}) = 0 \tag{F-1}$$

The corresponding channel input signals $\{y_i(t)\}$ are

$$y_i(t) \triangleq \frac{1}{\sqrt{c}}\, Q(p)\, v_i(t) \tag{F-2}$$

where

$$Q(s) \triangleq \prod_{i=1}^{n + \bar{m}} (s - z_i), \qquad z_i = \pm s_i, \qquad i = 1, 2, \ldots, n + \bar{m}$$

$$s_i = \begin{cases} \text{RHP zeros of } D(s^2) & \text{for } i = 1, \ldots, n \\ \text{RHP zeros of } \bar{N}(s^2) & \text{for } i = n + 1, \ldots, \bar{m} + n \end{cases}$$
and therefore

$$Q(s)\, Q(-s) = c\, D(s^2)\, \bar{N}(s^2) .$$

These solutions have the following properties:

(1) The choice $z_i = s_i$ or $z_i = -s_i$ in the definition of Q(s) is arbitrary.
Thus, there are $2^{n + \bar{m}}$ equally valid solutions to the maximization
problem.

(2) The $\{y_i(t)\}$ are orthogonal and may be assumed normalized, i.e.,
in the notation of Sec. IV-A,

$$(y_i, y_j)_{\mathcal{T}} = \delta_{ij} .$$

This normalization is assumed hereafter.

(3) If $K_1(t)$ is defined as

$$K_1(t) \triangleq \int_{-\infty}^{\infty} [N(f)]^{-1}\, e^{j\omega t}\, df$$

and if the generalized inner product is defined as

$$(f, K_1 g)_\infty \triangleq \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t)\, g(\sigma)\, K_1(t - \sigma)\, d\sigma\, dt$$

then the $\{y_i(t)\}$ are "doubly orthogonal" (in the sense of Sec. IV-A) at
the channel output with respect to the generalized inner product, i.e.,
if $r_{ij}(t)$ is the channel filter output when $y_i(t - j\mathcal{T})$ is transmitted, then

$$(r_{ij}, K_1 r_{kl})_\infty = \begin{cases} \lambda_i & \text{if } i = k \text{ and } j = l \\ 0 & \text{otherwise} \end{cases}$$

(4) The eigenvalues $\{\lambda_i\}$ are the "generalized energy transfer ratios" of
the filter for the corresponding eigenfunctions, i.e., with $r_{ij}(t)$ as
previously defined,

$$\lambda_i = \frac{(r_{i0}, K_1 r_{i0})_\infty}{(y_i, y_i)_{\mathcal{T}}} .$$

The importance of this result lies in the fact that the SNR at the
optimum detector output at t = 0 is given by

$$\int_{-\infty}^{\infty} \frac{|X(f)\, H(f)|^2}{N(f)}\, df$$

when x(t) is the transmitted signal. Thus, $y_1(t)$ is the solution to
the original maximization problem.